Artificial neural network to estimate the refractive index of a liquid infiltrating a chiral sculptured thin film

Abstract

We theoretically expanded the capabilities of optical sensing based on surface plasmon resonance in a prism-coupled configuration by incorporating artificial neural networks (ANNs). We used calculations modeling the situation in which an index-matched substrate with a metal thin film and a porous chiral sculptured thin film (CSTF) deposited successively on it is affixed to the base of a triangular prism. When a fluid is brought in contact with the exposed face of the CSTF, the latter is infiltrated. As a result of infiltration, the traversal of light entering one slanted face of the prism and exiting the other slanted face of the prism is affected. We trained two ANNs with differing structures using reflectance data generated from simulations to predict the refractive index of the infiltrant fluid. The best predictions were a result of training the ANN with simpler structure. With realistic simulated-noise, the performance of this ANN is robust.


Patrick D. McAtee (a,*), Satish T. S. Bukkapatnam (b), and Akhlesh Lakhtakia (a)
(a) The Pennsylvania State University, Department of Engineering Science and Mechanics, University Park, PA 16802, USA
(b) Texas A&M University, Department of Industrial and Systems Engineering, College Station, TX 77843, USA


The ability to accurately detect small concentrations of chemicals and biochemicals, whether toxic or benign, is highly prized in chemical, pharmaceutical, medical, environmental, and food industries [1, 2]. Introduction of pathogens and toxins into the human body can arise from intentional contamination of essential infrastructure [3] as well as from unforeseen consequences of their otherwise necessary applications [4]. Even chemicals traditionally thought of as non-toxic in their bulk form, such as gold, could have harmful effects when ingested as nanoparticles [5–7]. Also, the concentrations of various chemicals in solutions and dispersions need to be determined in research as well as industrial laboratories [1, 8, 9].

Sensors of chemicals and biosensors are designed to operate on the basis of several different phenomena, including electrochemical [10, 11], optical [12–14], piezoelectric [15], gravimetric [16], and pyroelectric [17]. Our focus here lies on optical sensors, of which several types exist [18, 19].

One commonly used optical-sensing technique relies on surface plasmon resonance (SPR), which can occur when light interacts with free electrons at a metal/dielectric interface [8]. As a result, a surface-plasmon-polariton (SPP) wave is excited. Changing the relative permittivity of the dielectric material changes the characteristics of the SPP wave, thus allowing for sensing [20]. Let us consider an SPP wave propagating along the $x$ axis guided by the metal/dielectric interface $z = 0$ and suppose that the metal and the partnering dielectric material are isotropic and homogeneous. The metal of relative permittivity $\varepsilon_{\rm met}$ fills the half-space $z < 0$ and the dielectric material of relative permittivity $\varepsilon_{\rm diel}$ fills the half-space $z > 0$. The complex-valued wavenumber of the SPP wave guided by the interface $z = 0$ is given by

$$ q = k \sqrt{\frac{\varepsilon_{\rm diel}\, \varepsilon_{\rm met}}{\varepsilon_{\rm diel} + \varepsilon_{\rm met}}}, \tag{1} $$

where $k$ is the free-space wavenumber. This SPP wave is $p$ polarized [21].

In order to evoke SPR in practice, many geometric configurations have been devised to couple incident light to the SPP wave guided by the metal/dielectric interface. The most commonly implemented configuration is a prism-coupled configuration called the Turbadar–Kretschmann–Raether (TKR) configuration [22, 23], wherein a metal film of thickness $L_{\rm met}$ and a dielectric film of thickness $L_{\rm diel}$ are successively deposited onto one face of a substrate, thus establishing a sensor chip [31]. The substrate has the same refractive index as a prism of refractive index $n_{\rm prism}$ which exceeds $n_{\rm diel} = \sqrt{\varepsilon_{\rm diel}}$, our assumption being that both $n_{\rm prism}$ and $n_{\rm diel}$ are real and positive. Typically, the prism has a cross-section of a 45°–90°–45° triangle. The second face of the substrate is affixed to the hypotenuse of the prism using an index-matching fluid.

A monochromatic $p$-polarized plane wave of free-space wavelength $\lambda$ and intensity $I$ is incident onto one slanted face of the prism at an angle $\phi$ with respect to the normal to that face. The refracted plane wave is incident on the substrate/metal interface (effectively, the prism/metal interface) at an angle $\theta$ with respect to the normal to the interface; in principle, $\theta \in [0^\circ, 90^\circ)$. The plane wave is reflected and exits the other slanted face of the prism. The intensity $I_r$ of the exiting plane wave is measured as a function of $\theta$ by a photodetector, and thereby the reflectance $R = I_r / I$ is deduced as a function of $\theta$. A sharp dip in the graph of $R$ vs. $\theta$ indicates the excitation of an SPP wave at a specific angle denoted by $\theta_{\rm SPP}$, provided that $\sin\theta_{\rm SPP} > n_{\rm diel}/n_{\rm prism}$.
This angle is characteristic of $\varepsilon_{\rm diel}$ and is related to the SPP wavenumber as follows:

$$ {\rm Re}(q) \simeq k\, n_{\rm prism} \sin\theta_{\rm SPP}. \tag{2} $$

The angle $\theta_{\rm SPP}$ changes with $n_{\rm diel}$. This is the principle of SPR-based sensing, with the partnering dielectric material being the material whose refractive index is the quantity sensed.

Only one dip indicating SPR appears at a fixed free-space wavelength $\lambda = 2\pi/k$, because the chosen metal/dielectric interface can guide just one SPP wave. Only one SPP wave can be excited at a fixed $\lambda$ even if the partnering dielectric material is anisotropic [24, 25].

Theory [20] and subsequent experiments [26–28], however, have confirmed that a periodically nonhomogeneous dielectric material, whether isotropic or anisotropic, partnering a metal in the TKR configuration can support the existence of multiple SPP-wave modes at a fixed $\lambda$. The different SPP-wave modes have their peak field within the partnering dielectric material at different distances from the interface [20]. If the periodically nonhomogeneous dielectric material is porous and is infiltrated by a fluid of refractive index $n_{\rm inf}$, theory shows [29] and experiment has confirmed [30, 31] that the locations of all SPP-wave modes in the graph of $R$ vs. $\theta$ shift with $n_{\rm inf}$, thereby enabling optical-sensing applications.

In addition to SPP-wave modes, waveguide modes [32, 33] may also manifest in the graph of $R$ vs. $\theta$. These waveguide modes differ from SPP-wave modes in that they can be bound to more than one interface, and therefore depend on the thickness of the partnering dielectric material [34].

There could even exist signatures of undiscovered phenomena in the graph of $R$ vs. $\theta$. Therefore, when deducing what $n_{\rm inf}$ generated a specific graph of $R$ vs. $\theta$, it is best to make use of all the features of the graph.
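As a rough numerical illustration of Eqs. (1) and (2), the following Python sketch computes the SPP wavenumber for a planar metal/dielectric interface and the corresponding angle $\theta_{\rm SPP}$ for a few values of $n_{\rm diel}$. The metal permittivity and the prism refractive index used here are illustrative assumptions rather than values taken from this paper, and the function names are hypothetical.

```python
import cmath
import math

def spp_wavenumber(k, eps_diel, eps_met):
    """Eq. (1): complex SPP wavenumber q for an isotropic, homogeneous pair."""
    return k * cmath.sqrt(eps_diel * eps_met / (eps_diel + eps_met))

def theta_spp_deg(k, eps_diel, eps_met, n_prism):
    """Eq. (2): angle (degrees) at which Re(q) = k * n_prism * sin(theta)."""
    q = spp_wavenumber(k, eps_diel, eps_met)
    s = q.real / (k * n_prism)
    if not 0.0 < s < 1.0:
        raise ValueError("no real coupling angle for these parameters")
    return math.degrees(math.asin(s))

# Assumed, illustrative inputs (only the wavelength comes from the paper).
WAVELENGTH_NM = 635.0
K = 2.0 * math.pi / WAVELENGTH_NM   # free-space wavenumber in 1/nm
EPS_MET = complex(-50.0, 18.0)      # hypothetical metal permittivity
N_PRISM = 1.78                      # hypothetical prism refractive index

for n_diel in (1.33, 1.34, 1.35):
    theta = theta_spp_deg(K, n_diel ** 2, EPS_MET, N_PRISM)
    print(f"n_diel = {n_diel:.2f} -> theta_SPP = {theta:.2f} deg")
```

With these assumed inputs, the computed angle increases with $n_{\rm diel}$, which is the shift that conventional SPR sensing relies on.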
Using all the features of the graph calls for artificial neural networks (ANNs), which can discern a complicated quantitative relationship without requiring an understanding of the various underlying phenomena [36].

Currently, ANNs and other machine-learning algorithms are being applied to a multitude of tasks in engineering and medicine [37–46], including inverse optical design [47, 48]. Although ANNs and machine learning have been applied to SPR biosensors [49–53], their use in scenarios to exploit the excitation of multiple guided-wave modes (including multiple SPP-wave modes and waveguide modes) as well as of other polarization-dependent features in the graph of $R$ vs. $\theta$ provides a novel avenue for optical sensing.

The plan of this paper is as follows. Section 2 provides brief introductions to: (i) the calculation of reflectances when a porous chiral sculptured thin film (CSTF) [29, 30, 54, 55] is used as the partnering dielectric material in the TKR configuration, and (ii) our application of ANNs to that optical-sensing scenario [56]. In Sec. 3, we give the parameters of two different ANNs devised by us and the data used to train and test each ANN. Section 4 details the performance of each ANN, and Sec. 5 contains a discussion of the implications of the numerical results.

Suppose that a CSTF of thickness $L_{\rm diel}$ is used as the partnering dielectric material in the TKR configuration and the planar metal/prism interface is identified as the plane $z = 0$. The CSTF comprises closely nested nanohelixes of a dielectric material that were grown parallel to each other by the process of physical vapor deposition. The CSTF is infiltrated by a fluid of refractive index $n_{\rm inf}$ which also fills the half space $z > L_{\rm met} + L_{\rm diel}$. The prism material is taken to fill the half space $z < 0$. A schematic is provided in Fig. 1.

Figure 1: Schematic of the TKR configuration used to calculate $R$ vs. $\theta$ when the partnering dielectric material is a CSTF infiltrated by a fluid of refractive index $n_{\rm inf}$. All layers extend infinitely transverse to the $z$ axis. The fluid extends to $+\infty$ in the $z$ direction and the prism extends to $-\infty$ in the $z$ direction.

The anisotropy and nonhomogeneity of the CSTF are macroscopically quantified by the relative permittivity dyadic [29, 54]

$$ \underline{\underline{\varepsilon}}_{\rm diel}(z) = \underline{\underline{S}}_z(z) \cdot \underline{\underline{S}}_y(\chi) \cdot \underline{\underline{\varepsilon}}^{\rm o}_{\rm ref} \cdot \underline{\underline{S}}_y^{-1}(\chi) \cdot \underline{\underline{S}}_z^{-1}(z). \tag{3} $$

Here, the local relative permittivity dyadic

$$ \underline{\underline{\varepsilon}}^{\rm o}_{\rm ref} = \hat{u}_x \hat{u}_x\, \varepsilon_b + \hat{u}_y \hat{u}_y\, \varepsilon_c + \hat{u}_z \hat{u}_z\, \varepsilon_a \tag{4} $$

in the material frame captures the local orthorhombicity of the CSTF; the dyadic

$$ \underline{\underline{S}}_z(z) = \hat{u}_z \hat{u}_z + (\hat{u}_x \hat{u}_x + \hat{u}_y \hat{u}_y) \cos\left(\frac{\pi z}{\Omega}\right) + h\, (\hat{u}_y \hat{u}_x - \hat{u}_x \hat{u}_y) \sin\left(\frac{\pi z}{\Omega}\right) \tag{5} $$

captures the rotation of the local relative permittivity dyadic $\underline{\underline{\varepsilon}}_{\rm ref} = \underline{\underline{S}}_y(\chi) \cdot \underline{\underline{\varepsilon}}^{\rm o}_{\rm ref} \cdot \underline{\underline{S}}_y^{-1}(\chi)$ in the laboratory frame about the $z$ axis, with $h \in \{-1, 1\}$ denoting the structural handedness and $2\Omega$ the period; and the dyadic

$$ \underline{\underline{S}}_y(\chi) = \hat{u}_y \hat{u}_y + (\hat{u}_x \hat{u}_x + \hat{u}_z \hat{u}_z) \cos\chi + (\hat{u}_z \hat{u}_x - \hat{u}_x \hat{u}_z) \sin\chi \tag{6} $$

represents the locally aciculate morphology of the CSTF, with the angle $\chi > 0^\circ$ measured with respect to the $xy$ plane. Both $\underline{\underline{\varepsilon}}_{\rm ref}$ and $\underline{\underline{\varepsilon}}^{\rm o}_{\rm ref}$ have the same eigenvalues, denoted by $\varepsilon_a$, $\varepsilon_b$, and $\varepsilon_c$, but their eigenvectors differ. All three parameters $\varepsilon_{a,b,c}$ of the fluid-infiltrated CSTF depend not only on $\lambda$ but also on $n_{\rm inf}$ (which is itself dependent on $\lambda$) [54].
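To make Eqs. (3)–(6) concrete, here is a minimal Python sketch that assembles the $3 \times 3$ matrix of the relative permittivity dyadic at a given $z$. It relies on the fact that both rotation matrices are orthogonal, so each inverse equals the transpose. The values of $\varepsilon_{a,b,c}$, $\chi$, and $h$ below are placeholders chosen only for illustration (they are assumptions, not the values used in this paper), and the function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose of a 3x3 matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def s_z(z, omega, h):
    """Eq. (5): rotation about the z axis; h = +1 or -1 is the handedness."""
    c = math.cos(math.pi * z / omega)
    s = math.sin(math.pi * z / omega)
    return [[c, -h * s, 0.0], [h * s, c, 0.0], [0.0, 0.0, 1.0]]

def s_y(chi):
    """Eq. (6): rotation about the y axis by the angle chi (radians)."""
    c, s = math.cos(chi), math.sin(chi)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def eps_diel(z, omega, h, chi, eps_a, eps_b, eps_c):
    """Eq. (3), with the inverses of the rotations taken as transposes."""
    eps_o_ref = [[eps_b, 0.0, 0.0],
                 [0.0, eps_c, 0.0],
                 [0.0, 0.0, eps_a]]          # Eq. (4)
    rot = matmul(s_z(z, omega, h), s_y(chi))
    return matmul(matmul(rot, eps_o_ref), transpose(rot))

# Half-period from Sec. 3; every other value is an illustrative assumption.
OMEGA_NM = 200.0
H = 1
CHI = math.radians(37.0)
EPS_A, EPS_B, EPS_C = 2.1, 3.0, 2.5

for z_nm in (0.0, 100.0, 400.0):
    eps = eps_diel(z_nm, OMEGA_NM, H, CHI, EPS_A, EPS_B, EPS_C)
    print(z_nm, [round(v, 4) for v in eps[0]])
```

Because the dyadic at every $z$ is a rotated copy of the same diagonal dyadic, its trace stays equal to $\varepsilon_a + \varepsilon_b + \varepsilon_c$ and it repeats with period $2\Omega$ along the $z$ axis.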
If $\varepsilon_{a,b,c}$ are known for $n_{\rm inf} = 1$ at a specific value of $\lambda$, then a combination of inverse and forward Bruggeman homogenization formalisms can be used to deduce their values for $n_{\rm inf} \neq 1$ at the same $\lambda$ [29].

The electric field phasor of the plane wave incident on the prism/metal interface can be written as [54]

$$ \begin{aligned} \mathbf{E}_{\rm inc}(\mathbf{r}) = {} & \left\{ a_s (-\hat{u}_x \sin\psi + \hat{u}_y \cos\psi) + a_p \left[ -(\hat{u}_x \cos\psi + \hat{u}_y \sin\psi) \cos\theta + \hat{u}_z \sin\theta \right] \right\} \\ & \times \exp\left[ i k n_{\rm prism} (x \cos\psi + y \sin\psi) \sin\theta \right] \exp\left( i k n_{\rm prism} z \cos\theta \right), \end{aligned} \tag{7} $$

where $i = \sqrt{-1}$, $a_s$ is the amplitude of the $s$-polarized component, and $a_p$ is the amplitude of the $p$-polarized component. The incidence direction is specified by the angles $\theta \in [0^\circ, 90^\circ)$ and $\psi \in [0^\circ, 360^\circ)$. The electric field phasor of the reflected plane wave can be written as [54]

$$ \begin{aligned} \mathbf{E}_{\rm ref}(\mathbf{r}) = {} & \left\{ r_s (-\hat{u}_x \sin\psi + \hat{u}_y \cos\psi) + r_p \left[ (\hat{u}_x \cos\psi + \hat{u}_y \sin\psi) \cos\theta + \hat{u}_z \sin\theta \right] \right\} \\ & \times \exp\left[ i k n_{\rm prism} (x \cos\psi + y \sin\psi) \sin\theta \right] \exp\left( -i k n_{\rm prism} z \cos\theta \right), \end{aligned} \tag{8} $$

where $r_s$ is the amplitude of the $s$-polarized component and $r_p$ is the amplitude of the $p$-polarized component. The reflectance is defined as

$$ R = \frac{|r_s|^2 + |r_p|^2}{|a_s|^2 + |a_p|^2}. \tag{9} $$

The procedure to compute $r_s$ and $r_p$, and therefore $R$, from $a_s$ and $a_p$ as functions of $\lambda$, $\theta$, and $\psi$ is available in detail elsewhere [20]. The procedure to obtain $\varepsilon_{a,b,c}$ as functions of $\lambda$ and $n_{\rm inf}$ is also available in detail elsewhere [29, 54].

ANNs are machine-learning algorithms of a specific type [36]. Machine-learning algorithms are constructed such that they improve in performance over time for a specific task without being explicitly programmed for that task. Typically, an ANN comprises multiple nodes (neurons) organized in several layers arranged in a hierarchy. Each node in a given layer is interconnected with all the nodes in the adjacent layers. These interconnections are represented as numerical values called weights. The designated first layer serves as the input layer and the designated last layer as the output layer.
Each node in a given layer computes a linear combination of the values of the nodes in the previous layer, each value being multiplied by the weight of the corresponding interconnection. In order to account for possible non-linear processes, the linear combination is fed into an activation function, such as the sigmoid or rectifier function. A neuron with no activation function is called a linear neuron, and an ANN composed exclusively of linear neurons will fit a linear model to the data.

ANNs require training and testing before implementation. An ANN is trained on a data set $\mathcal{R}$ of column vectors denoted by $\mathbf{R}_j$, $j \in [1, J]$. Each $\mathbf{R}_j$ consists of $K$ scalars $R_{jk}$, $k \in [1, K]$. In addition, there are label vectors $\mathbf{n}_j$, each consisting of $L$ scalars $n_{j\ell}$, $\ell \in [1, L]$. Every $\mathbf{R}_j$ is accompanied by a unique $\mathbf{n}_j$, but some of the labels may share the same value when considering noisy data. As an example, two different sensor chips employed in the TKR configuration may produce slightly different reflectance signatures given the same infiltrating liquid, due to differences in sensor-chip quality from the fabrication process.

For a specific $j = j'$, $\mathbf{R}_{j'}$ is fed into the input layer, $R_{j'k}$ being fed to the node labeled $k$ in that layer. With the weights randomized, the label vector $\mathbf{n}^\star_{j'}$ is predicted by the ANN and an error value is calculated based on some predefined error function of $\mathbf{n}_j$ and $\mathbf{n}^\star_j$. In general, $\mathbf{n}^\star_j \neq \mathbf{n}^\star_{j'}$. This error function can be expressed as a function of the weights, since $\mathbf{n}^\star_j$ is a function of the weights. Typically, an ANN uses a gradient-descent method [35] to find the weights such that the average error over $\mathcal{R}$ is sufficiently small. The ANN is then tested on a data set $\bar{\mathcal{R}}$.

In this paper, every testing vector and its elements are identified by the addition of an overbar to the symbol for the corresponding training vector and its elements.
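The remark that an ANN composed exclusively of linear neurons fits a linear model can be checked numerically. The Python sketch below, offered only as an illustration, passes an input through two stacked layers of linear neurons and through the single equivalent linear map obtained by multiplying the weight matrices. The layer sizes, the random weights, and the bias terms are assumptions made for this example (biases are not discussed in the text), and the function names are hypothetical.

```python
import random

def affine(weights, bias, x):
    """One layer of linear neurons: weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def collapse(w1, b1, w2, b2):
    """Single linear map equivalent to layer 2 applied after layer 1."""
    hidden, inputs = len(w1), len(w1[0])
    w = [[sum(w2[i][h] * w1[h][k] for h in range(hidden)) for k in range(inputs)]
         for i in range(len(w2))]
    b = [sum(w2[i][h] * b1[h] for h in range(hidden)) + b2[i]
         for i in range(len(w2))]
    return w, b

random.seed(0)
K, HIDDEN = 5, 4  # small illustrative sizes, not those used in the paper
w1 = [[random.uniform(-1, 1) for _ in range(K)] for _ in range(HIDDEN)]
b1 = [random.uniform(-1, 1) for _ in range(HIDDEN)]
w2 = [[random.uniform(-1, 1) for _ in range(HIDDEN)]]
b2 = [random.uniform(-1, 1)]

x = [random.random() for _ in range(K)]       # stand-in for one reflectance vector
two_layer = affine(w2, b2, affine(w1, b1, x))[0]
w, b = collapse(w1, b1, w2, b2)
one_layer = affine(w, b, x)[0]
print(abs(two_layer - one_layer) < 1e-12)
```

The hidden layer therefore adds no expressive power when it has no activation function; what changes is only how the linear model is parameterized during training.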
We also use $\mathcal{R}_{ab}$, with $a$ and $b$ as placeholders, to denote the polarization state (represented by the ratio of $a_s$ to $a_p$) and the angle $\psi$, respectively, of the incident plane wave for which the reflectance data are obtained. This notation is explained in Tables 1 and 2. Next, all styles and fonts of uppercase 'R' represent reflectance data calculated for various polarization states, angles $\theta$, and angles $\psi$. Finally, all styles and fonts of lowercase 'n' represent refractive-index data (i.e., $n_{\rm inf}$) corresponding to the reflectance data. Let us note that $n_{\rm inf}$ denotes the refractive index of the infiltrating liquid in general, whether used for training, testing, both, or neither.

Theorems of machine learning suggest that an algorithm tailored to the needs of the specific application must be sought [57, 58]. Therefore, we trained two ANNs with differing structures for various $\mathcal{R}$ with label vectors $\mathbf{n}_j$ and tested the ANNs for various $\bar{\mathcal{R}}$ with the label vectors $\bar{\mathbf{n}}_j$. For this work, both $\mathbf{n}_j$ and $\bar{\mathbf{n}}_j$ comprise just one element each (i.e., $L = 1$) and therefore are denoted simply as $n_j$ and $\bar{n}_j$, respectively.

Table 1: Label $a$ and polarization state of the incident plane wave in the prism.

    a    a_s      a_p       polarization state
    1    0        1         linear (p)
    2    1        0         linear (s)
    3    1/√2     1/√2      linear (p and s)
    4    1/√2     −1/√2     linear (p and s)
    5    1/√2     i/√2      circular
    6    1/√2     −i/√2     circular

Table 2: Label $b$ and angle $\psi$ chosen for the incident plane wave in the prism.

    b    ψ (deg)
    1    0
    2    18
    3    36
    4    54
    5    72
    6    90

After fixing $\lambda = 635$ nm, various $\mathcal{R}$ were calculated using a CSTF with half-period $\Omega = 200$ nm and thickness $L_{\rm diel} = 1200$ nm made of titanium oxide. When $n_{\rm inf} = 1$, we set $\varepsilon_a = 2.[…]$, $\varepsilon_b = 3.[…]$, $\varepsilon_c = 2.[…]$, and $\chi = 37.[…]^\circ$, as provided by actual measurements on columnar thin films [59]. Values of $\varepsilon_{a,b,c}$ for $n_{\rm inf} >$ […] $n_{\rm met} = \sqrt{\varepsilon_{\rm met}} = 0.[…]i$ […] $L_{\rm met} = 30$ nm [30]. The prism was chosen to be made of SF11 glass so that $n_{\rm prism} = 1.[…]$ […] $n_{\rm water} = 1.[…]$ […] $\mathcal{R}$ and $\bar{\mathcal{R}}$ for $n_{\rm inf} \in [1.[…], […]4]$ for equal increments $\Delta n_{\rm inf} = 0.[…]$ […] $n_{\rm inf} \approx$ […] $J$ […].

All $\mathcal{R}$ and $\bar{\mathcal{R}}$ were calculated for $\phi \in [-[…]^\circ, […]^\circ]$ in steps of $\Delta\phi = 0.[…]^\circ$, which can be related [31] to $\theta$ by the standard law of refraction, yielding

$$ \theta = 45^\circ + \sin^{-1}\left( n_{\rm prism}^{-1} \sin\phi \right) $$

for the prism with a 45°–90°–45° triangle as its cross section. All data were computed using Mathematica® running on a laptop computer with a 4-core 2.4-GHz processor and 16 GB of RAM. With $\phi$, $n_{\rm inf}$, and $b$ fixed, the average estimated computation time was $\sim[…].06$ s for the set of six polarization states specified in Table 1.
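The relation between the external angle $\phi$ and the internal angle $\theta$ given above follows from the law of refraction at the slanted face of a 45°–90°–45° prism. A minimal Python sketch of that conversion is shown below; the value of $n_{\rm prism}$ is a placeholder assumption, and the function name is hypothetical.

```python
import math

def theta_from_phi(phi_deg, n_prism):
    """theta = 45 deg + asin(sin(phi) / n_prism) for a 45-90-45 prism."""
    sin_refracted = math.sin(math.radians(phi_deg)) / n_prism
    return 45.0 + math.degrees(math.asin(sin_refracted))

N_PRISM = 1.78  # hypothetical value, assumed only for this illustration

for phi in (-20.0, 0.0, 20.0):
    print(f"phi = {phi:+.1f} deg -> theta = {theta_from_phi(phi, N_PRISM):.3f} deg")
```

Normal incidence on the slanted face ($\phi = 0^\circ$) gives $\theta = 45^\circ$, and because the prism is optically denser than air, a given step in $\phi$ maps to a smaller step in $\theta$.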

SPP-wave modes are capable of being excited by incident light of an arbitrary polarization state in theprism-coupled conﬁguration when a periodically nonhomogeneous dielectric material is partnering a metalthin ﬁlm [20]. However, experiments have shown that using p -polarized incident light results in a higher5ensitivity over s -polarized incident light, based on the deﬁnition [31] ρ = θ SPP ( n inf ) − θ SPP ( n inf ) n inf − n inf (10)of sensitivity, where n inf (cid:96) , (cid:96) ∈ { , } , is the refractive index of an inﬁltrant ﬂuid labeled (cid:96) and θ SPP ( n inf (cid:96) )is the value of θ SPP for n inf (cid:96) . One might then infer that R from p -polarized incident light will yield bettertraining for an ANN. However, since ρ is focused on SPP-wave modes but not on waveguide modes and otherphenomena, as mentioned in Sec. 1, we conjectured that s -polarized incident light as well as incident lightof other polarization states may also be useful for ANN training.Therefore, our ﬁrst group of simulations used R calculated for six diﬀerent polarization states of incidentlight with ψ = 0 ◦ . Consistently with Tables 1 and 2, the six training data sets are denoted by R a , a ∈ [1 , J = 10 and K = 250; thus, 10 values of n j and 250 values of θ were used. For R a , a ∈ [1 , J = 200 and K = 250. ψ To determine what value of the angle ψ is the most eﬀective, training data sets R b , b ∈ [1 , J = 10 and K = 250. For the counterpart testing datasets R b , b ∈ [1 , J = 200 and K = 250. Even if one polarization state is more eﬀective than the others, the ANN still may beneﬁt from the inclusionof reﬂectance data from the other polarization states. We use the notation R cb to mean reﬂectance datathat includes polarization states labeled a ∈ { , , ...., c } ; thus, R cb = R b ∪ R b ∪ · · · ∪ R cb and R ≡ R .Fixing ψ = 0 ◦ , we focused on R c . For R , R , R , R , R , and R , we have J = 10 and K =250 , , , , J = 200 for R c , K = 250 , , , , c = 1 , , , ,

5, and 6, respectively. ψ With similar reasoning as in Sec. 3.1.3, we write R a c to mean reﬂectance data that includes values of ψ corresponding to b ∈ { , , ...., c } ; thus, R a c = R a ∪ R a ∪ · · · ∪ R ac and R ≡ R ≡ R . Fixing our attentiononly on incident p -polarized light, we focused on R c . For R , R , R , R , R , and R , J = 10 andand K = 250 , , , , R c was constituted with J = 200 and K = 250 , , , , c = 1 , , , ,

5, and 6, respectively. ψ Of the multitude of reﬂectance data sets that can be determined for combinations of polarization states and ψ , we only investigated R = (cid:0) R ∪ R ∪ · · · ∪ R (cid:1) ∪ (cid:0) R ∪ R ∪ · · · ∪ R (cid:1) ∪ · · · ∪ (cid:0) R ∪ R ∪ · · · ∪ R (cid:1) . (11)For this training data set, J = 10 and K = 9000. For the corresponding testing data set R , J = 200 and K = 9000. For the data set that yielded the best ANN performance, noise was added to that set to simulate experimentaldata. The corresponding noisy data set is denoted by R with the speciﬁc superscript and subscript of theset that yielded the best performance. As shown in Sec. 4, the best-performance set was R . The noise was6igure 2: Reﬂectance as a function of θ measured for four diﬀerent samples in the TKR conﬁguration, eachwith a 30-nm-thick aluminum ﬁlm and air as the partnering dielectric material. Each data point from agiven plot was compared to the data point for the same θ from all other plots. The absolute diﬀerence of allthese measurements was averaged and calculated to be 0 . .

05. This magnitude was split about 0, yielding − .

025 for the lower bound and 0 .

025 for the upper boundof random numbers used to add noise to R .simulated as a random number between − .

025 and 0 .

025 added to each element R jk ∈ R j ∈ R for 10diﬀerent instances. The upper and lower bounds of the random numbers were chosen based on the averageabsolute diﬀerence of reﬂectance data between four diﬀerent 30-nm-thick aluminum ﬁlms implemented ina speciﬁc TKR apparatus, as shown in Fig. 2 . Thus, for R , J = 10 ×

10 = 100 and K = 250. For thecorresponding testing data set R , J = 200 ×

10 = 2000 and K = 250. Up until this stage, each training and testing data set has included reﬂectance data from θ ranging fromabout 30 ◦ to about 60 ◦ . With inspiration from conventional SPR occurring at the interface of a metal anda homogeneous dielectric material, we devised a data set denoted by R SP R with θ ranging between 0 ◦ and70 ◦ in steps of 0 . ◦ , with a = 1, b = 1, J = 10, K = 701, and with no noise added. In order to ensure thatonly the occurrences of SPR were captured in R SP R , the reﬂectances were calculated with L diel = 1200 nmand L diel = 1600 nm, and only those dips that diﬀered by less than 0 . ◦ for the two values of L diel wereretained [30, 31]. Thus most entries in the column vectors in R SP R were zero. The corresponding testingdata set R SP R was constituted with J = 200, and K = 701. This data set would allow us to to determinethe relative eﬃcacy of SPR alone for training ANNs. All training and testing was done using MATLAB R (cid:13) running on a laptop computer with a 4-core 2.4-GHzprocessor and 16 GB of RAM. In all, training took no more than a day. Each training data set was usedfor two separate ANN structures. The ﬁrst type of ANN structure, denoted by ANN , consisted of threelayers: an input layer, one hidden layer, and an output layer. The hidden layer contained 100 nodes with no7ctivation function (linear neurons) and the output layer contained one node. The second ANN structure,denoted by ANN consisted of four layers: an input layer, two hidden layers, and an output layer. The twohidden layers each contained 100 nodes with rectiﬁer activation functions, and the output layer contained onenode. The input layer for both ANN and ANN consisted of one vector. The size of this vector was either250, 500, 750, 1000, 1250, 1500, 9000, or 701 depending on the speciﬁc data set being used for training. Thestochastic gradient-descent with momentum method [61] was chosen for optimization with an initial learningrate of 0 .

01 and the error function deﬁned as the mean-squared error. The number of maximum epochs was10 , µ , standard deviation δ , median σ , maximum M , and minimum m of | n (cid:63)j − n j | for each testingdata set were used to assess the performance of each ANN. In addition, for several instances of trainingfor any given ANN with a particular structure and training data set, the performance of that ANN giventhe same testing data set may vary due to the fact that the weights are randomized at the start of eachtraining instance. Therefore, given a particular training data set, we trained each ANN for ten instancesand averaged the aforementioned statistical measures for all of the training instances. Thus, hereafter, thesymbols µ , δ , σ , M , and m , denote performance measures averaged over ten training instances. Ideally, allﬁve performance measures should be as close to 0 as possible. Values of all ﬁve performance measures for every training data set for ANN are listed in Table 3. This tableis divided into six blocks, one each for R a , R b , R c , R c , R , and R (1 , , , , ,

6, respectively). Withineach of the ﬁrst four blocks, those training sets which yielded the lowest value for each performance measureare highlighted by a colored background. In addition, the overall lowest value for each performance measureis identiﬁed in boldface. 8able 4: Values for µ , δ , σ , M , and m for ANN . Note that R , R , and R are equivalent by virtue ofthe deﬁnitions in Sec. 3.1. µ (10 − ) δ (10 − ) σ (10 − ) M (10 − ) m (10 − ) µ (10 − ) δ (10 − ) σ (10 − ) M (10 − ) m (10 − ) R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . . R . . . . R . . . . . R . . R . . . . . R . . . . . and ANN Training of ANN for each R provided a linear function to predict n inf . By having three out of ﬁve perfor-mance measures as the lowest, we may deﬁne R as providing the best training for ANN . We also notethat changing ψ does not change the performance of ANN as much as changing the polarization state. Thesame statement goes for the addition of more ψ values versus the addition of more polarization states.Training of ANN for each R provided a non-linear function to predict n inf . By having 4 out of 5performance measures being the lowest, we may deﬁne R as providing the best training for ANN . If wecompare ANN and R with ANN and R , the latter pair perform better, despite the former training setcontaining more data and the latter ANN having a simpler structure.Training of ANN with R SP R yielded µ = 220 . × − , δ = 156 . × − , σ = 211 . × − , m = 0 . × − , and M = 507 . × − , while training of ANN with R SP R yielded µ = 251 . × − , δ = 145 . × − , σ = 251 . × − , m = 2 . × − , and M = 499 . × − . Overall, bothsets of measures are signiﬁcantly worse compared to training with any other R . 
We conclude from this that sensing only with SPR data seriously undermines the ability of an ANN to correctly predict $n_{\rm inf}$.

When noise consistent with actual experimental data was added, we found that the performance of a simple ANN with a small number of training examples yielded no more than a 0.[…] […] $n_{\rm inf}$, based on the fact that $M = 36.[…] \times 10^{-[…]}$ for the ANN trained with $\mathcal{R}$ […]. Thus, use of ANNs provides a level of immunity to measurement noise.

A typical method for sensing is via SPR using the TKR configuration. We have simulated reflectance data from a liquid-infiltrated CSTF partnering a metal thin film in the TKR configuration and used these data to train two ANNs with differing structures. The performance measures of the ANNs for many training instances were compared. The various training data sets contained reflectance data calculated for various combinations of the polarization state and the angle $\psi$. Some of these training data were complicated by realistic noise. We stress here that our work pertains directly to the sensing of $n_{\rm inf}$ and only indirectly to the identification of the infiltrant fluid, functionalization [8] being required for the latter purpose.

One main conclusion is that $n_{\rm inf}$ can best be predicted from $p$-polarized light with $\psi = 90^\circ$, with an ANN having no activation function. This instance represents a best-case scenario. It will require the use of either a triangular prism with a very broad base or a hemispherical prism because the value of $\psi$ is very high.

Another main conclusion of this paper is that the inclusion of other reflectance data in addition to SPR data greatly improves the performance of an ANN. Given the simplicity and heuristic choice of the ANN structure and the relatively small number of training examples compared to the testing examples used for this work, we are optimistic that significant improvement in the performance can be achieved in the future by adding more training examples and refining the ANN structure.
Eventually, the application of ANNs may engender an era of simultaneous multianalyte sensing [30]. We also expect our ANN methodology to apply when SPP waves are manipulated by active functional materials and phase-change materials for enhanced sensitivity [62, 63].

Appendix A: MATLAB® codes for Artificial Neural Networks

The variables n and mb represent the input layer size and mini-batch size, respectively. The mini-batch size was that of the training data set used for that instance.

A.1: ANN with one hidden layer of linear neurons

    % Input layer, one hidden layer of 100 linear neurons, one output node
    layers = [ ...
        sequenceInputLayer(n)
        fullyConnectedLayer(100)
        fullyConnectedLayer(1)
        regressionLayer];
    % Stochastic gradient descent with momentum
    options = trainingOptions('sgdm', 'InitialLearnRate', 0.01, ...
        'MaxEpochs', 10000, ...
        'MiniBatchSize', mb)

A.2: ANN with two hidden layers of rectifier neurons

    % Input layer, two hidden layers of 100 rectifier neurons, one output node
    layers = [ ...
        sequenceInputLayer(n)
        fullyConnectedLayer(100)
        reluLayer
        fullyConnectedLayer(100)
        reluLayer
        fullyConnectedLayer(1)
        regressionLayer];
    % Stochastic gradient descent with momentum
    options = trainingOptions('sgdm', 'InitialLearnRate', 0.01, ...
        'MaxEpochs', 10000, ...
        'MiniBatchSize', mb)

Acknowledgments. The research of P. D. McAtee and A. Lakhtakia is funded by the Charles Godfrey Binder Endowment at the Pennsylvania State University.

References

[1] C. McDonagh, C. S. Burke, and B. D. MacCraith, “Optical chemical sensors,” Chemical Reviews, 400–422 (2008). [doi: 10.1021/cr068102g].
[2] A. P. F. Turner, “Biosensors: sense and sensibility,” Chemical Society Reviews, 3184–3196 (2013). [doi: 10.1039/c3cs35528d].
[3] D. V. Lim, J. M. Simpson, E. A. Kearns, and M. F. Kramer, “Current and developing technologies for monitoring agents of bioterrorism and biowarfare,” Clinical Microbiology Reviews, 583–607 (2005). [doi: 10.1128/CMR.18.4.583-607.2005].
[4] N. Verma and A. Bhardwaj, “Biosensor technology for pesticides—A review,” Applied Biochemistry and Biotechnology, 3093–3119 (2014). [doi: 10.1007/s12010-015-1489-2].
[5] A. Orlando, M. Colombo, D. Prosperi, F. Corsi, A. Panariti, I. Rivolta, M. Masserini, and E. Cazzaniga, “Evaluation of gold nanoparticles biocompatibility: a multiparametric study on cultured endothelial cells and macrophages,” Journal of Nanoparticle Research, 58 (2016). [doi: 10.1007/s11051-016-3359-4].
[6] I. Fratoddi, I. Venditti, C. Cametti, and M. V. Russo, “How toxic are gold nanoparticles? The state of the art,” Nano Research, 1771–1799 (2015). [doi: 10.1007/s12274-014-0697-3].
[7] A. M. Alkilany and C. J. Murphy, “Toxicity and cellular uptake of gold nanoparticles: what have we learned so far?,” Journal of Nanoparticle Research, 2313–2333 (2010). [doi: 10.1007/s11051-010-9911-8].
[8] J. Homola, “Surface plasmon resonance sensors for detection of chemical and biological species,” Chemical Reviews, 462–493 (2008). [doi: 10.1021/cr068107d].
[9] H. Malekzad, P. S. Zangabad, H. Mohammad, M. Sadroddini, Z. Jafari, N. Mahlooji, S. Abbaspour, S. Gholami, M. G. Houshangi, R. Pashazadeh, A. Beyzavi, M. Karimi, and M. R. Hamblin, “Noble metal nanostructures in optical biosensors: Basics, and their introduction to anti-doping detection,” Trends in Analytical Chemistry, 116–135 (2018). [doi: 10.1016/j.trac.2017.12.006].
[10] J. R. Stetter and J. Li, “Amperometric gas sensors—A review,” Chemical Reviews, 352–366 (2008). [doi: 10.1021/cr0681039].
[11] S. Cosnier, Electrochemical Biosensors, Pan Stanford, New York (2015).
[12] I. Abdulhalim, M. Zourob, and A. Lakhtakia, “Surface plasmon resonance for biosensing: A mini-review,” Electromagnetics, 214–242 (2008). [doi: 10.1080/02726340801921650].
[13] T. Taliercio, F. G.-P. Flores, F. B. Barho, M. J. Milla–Rodrigo, M. Bomers, L. Cerutti, and E. Tournié, “Plasmonic bio-sensing based on highly doped semiconductors,” Proceedings of SPIE, 103530S (2017). [doi: 10.1117/12.2274303].
[14] M. Arjmand, H. Saghafifar, M. Alijanianzadeh, and M. Soltanolkotabi, “A sensitive tapered-fiber optic biosensor for the label-free detection of organophosphate pesticides,” Sensors and Actuators B: Chemical, 523–532 (2017). [doi: 10.1016/j.snb.2017.04.121].
[15] P. Skládal, “Piezoelectric biosensors,” Trends in Analytical Chemistry, 127–133 (2016). [doi: 10.1016/j.trac.2015.12.009].
[16] M. DeMiguel–Ramos, B. Díaz–Durán, J.-M. Escolano, M. Barba, T. Mirea, J. Olivares, M. Clement, and E. Iborra, “Gravimetric biosensor based on a 1.3 GHz AlN shear-mode solidly mounted resonator,” Sensors and Actuators B: Chemical, 1282–1288 (2017). [doi: 10.1016/j.snb.2016.09.079].
[17] A. Davidson, A. Buis, and I. Glesk, “Toward novel wearable pyroelectric temperature sensor for medical applications,” IEEE Sensors Journal, 6682–6689 (2017). [doi: 10.1109/JSEN.2017.2744181].
[18] A. Rasooly and K. E. Herold (eds.), Biosensors and Biodetection: Methods and Protocols, Volume 503: Optical-Based Detectors, Humana Press, New York (2009).
[19] M. Zourob and A. Lakhtakia (eds.), Optical Guided-wave Chemical and Biosensors, Vols. 1 and 2, Springer, Heidelberg, Germany (2010).
[20] J. A. Polo Jr., T. G. Mackay, and A. Lakhtakia, Electromagnetic Surface Waves: A Modern Perspective, Elsevier, Waltham, Massachusetts (2013).
[21] H. J. Simon, D. E. Mitchell, and J. G. Watson, “Surface plasmons in silver films—a novel undergraduate experiment,” American Journal of Physics, 630–636 (1975). [doi: 10.1119/1.9764].
[22] T. Turbadar, “Complete absorption of light by thin metal films,” Proceedings of the Physical Society, 40–44 (1959). [doi: 10.1088/0370-1328/73/1/307].
[23] E. Kretschmann and H. Raether, “Radiative decay of non radiative surface plasmons excited by light,” Zeitschrift für Naturforschung A, 2135–2136 (1968). [doi: 10.1515/zna-1968-1247].
[24] G. J. Sprokel, “The reflectivity of a liquid crystal cell in a surface plasmon experiment,” Molecular Crystals and Liquid Crystals, 39–45 (1981). [doi: 10.1080/00268948108073551].
[25] G. J. Sprokel, R. Santo, and J. D. Swalen, “Determination of the surface tilt angle by attenuated total reflection,” Molecular Crystals and Liquid Crystals, 29–38 (1981). [doi: 10.1080/00268948108073550].
[26] Devender, D. P. Pulsifer, and A. Lakhtakia, “Multiple surface plasmon polariton waves,” Electronics Letters, 1137–1138 (2009). [doi: 10.1117/1.3249629].
[27] A. Lakhtakia, Y.-J. Jen, and C.-F. Lin, “Multiple trains of same-color surface plasmon-polaritons guided by the planar interface of a metal and a sculptured nematic thin film. Part III: Experimental evidence,” Journal of Nanophotonics, 033506 (2009). [doi: 10.1117/1.3249629].
[28] T. H. Gilani, N. Dushkina, W. L. Freeman, M. Z. Numan, D. N. Talwar, and D. P. Pulsifer, “Surface plasmon resonance due to the interface of a metal and a chiral sculptured thin film,” Optical Engineering, 120503 (2010). [doi: 10.1117/1.3525282].
[29] T. G. Mackay and A. Lakhtakia, “Modeling chiral sculptured thin films as platforms for surface-plasmonic-polaritonic optical sensing,” IEEE Sensors Journal, 273–280 (2012). [doi: 10.1109/JSEN.2010.2067448].
[30] S. E. Swiontek, D. P. Pulsifer, and A. Lakhtakia, “Optical sensing of analytes in aqueous solutions with multiple surface-plasmon-polariton-wave platform,” Scientific Reports, 1409 (2013). [doi: 10.1038/srep01409].
[31] S. E. Swiontek and A.
Lakhtakia, “Inﬂuence of silver-nanoparticle layer in a chiral sculptured thinﬁlm for surface-multiplasmonic sensing of analytes in aqueous solution,” Journal of Nanophotonics ,033008 (2016). [doi: 10.1117/1.JNP.10.033008].[32] D. Marcuse, Theory of Dielectric Optical Waveguides , Academic Press, San Diego, California (1991).[33] T. Khaleque and R. Magnusson, “Light management through guided-mode resonances in thin-ﬁlm siliconsolar cells,”

Journal of Nanophotonics , 083995 (2014). [doi: 10.1117/1.JNP.8.083995].[34] L. Liu, M. Faryad, A. S. Hall, G. D. Barber, S. Erten, T. E. Mallouk, A. Lakhtakia, and T. S.Mayer, “Experimental excitation of multiple surface plasmon-polariton waves and waveguide modesin a one-dimensional photonic crystal atop a two-dimensional metal grating,” Journal of Nanophotonics , 093593 (2015). [doi: 10.1117/1.JNP.9.093593].[35] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal representations by error propa-gation,” In: D. E. Rumelhart and J. L. McClelland (eds), Parallel Distributed Processing: Explorationsin the Microstructure of Cognition. Volume 1: Foundations , pp. 318–362, MIT Press, Cambridge, Mas-sachusetts (1986).[36] I. Goodfellow, Y. Bengio, and A. Courville,

Deep Learning , MIT Press, Cambridge, Massachusetts(2016). 1237] S. T. S. Bukkapatnam, A. Lakhtakia, and S. R. T. Kumara, “Chaotic neurons for on-line qualitycontrol in manufacturing,”

International Journal of Advanced Manufacturing Technology , 95–100(1997). [doi: 10.1007/BF01225755].[38] S. T. S. Bukkapatnam, S. R. T. Kumara, and A. Lakhtakia, “Fractal estimation of ﬂank wear in turning,” Journal of Dynamic Systems, Measurement, and Control , 89–94 (2000). [doi: 10.1115/1.482446].[39] S. T. S. Bukkapatnam, S. R. T. Kumara, and A. Lakhtakia, “Analysis of acoustic emission sig-nals in machining,”

Journal of Manufacturing Science and Engineering , 568–576 (1999). [doi:10.1115/1.2833058].[40] S. T. S. Bukkapatnam, A. Lakhtakia, and S. R. T. Kumara, “Analysis of sensor signals showsturning on a lathe exhibits low-dimensional chaos,”

Physical Review E , 2375–2387 (1995).[doi:10.1103/PhysRevE.52.2375].[41] J. J. Braun, Y. Glina, J. K. Su, and T. J. Dasey, “Computational intelligence in biological sensing,”

Proceedings of SPIE , 111–122 (2004). [doi: 10.1117/12.541046].[42] S. Chakrabartty and Y. Liu, “Towards reliable multi-pathogen biosensors using high-dimensional en-coding and decoding techniques,”

Proceedings of SPIE , 703514 (2008). [doi: 10.1117/12.799358].[43] V. A. Saetchnikov, E. A. Tcherniavskaia, G. Schweiger, and A. Ostendorf, “Classiﬁcation of antibi-otics by neural network analysis of optical resonance data of whispering gallery modes in dielectricmicrospheres,”

Proceedings of SPIE , 84240Q (2012). [doi: 10.1117/12.920397].[44] P. H. Rogers, K. D. Benkstein, and S. Semancik, “Machine learning applied to chemical analysis:Sensing multiple biomarkers in simulated breath using a temperature-pulsed electronic-nose,”

AnalyticalChemistry , 9774–9781 (2012). [doi: 10.1021/ac301687j].[45] N. Maleki, S. Kashanian, E. Maleki, and M. Nazari, “A novel enzyme based biosensor for catecholdetection in water samples using artiﬁcial neural network,” Biochemical Engineering Journal , 1–11(2017). [doi: 10.1016/j.bej.2017.09.005].[46] K. N. Mutter, “Hopﬁeld neural network and optical ﬁber sensor as intelligent heart rate monitor,”

Proceedings of SPIE , 104564T (2018). [doi: 10.1117/12.2283012].[47] W. Ma, F. Cheng, and Y. Liu, “Deep-learning-enabled on-demand design of chiral metamaterials,”

ACSNano , 6326–6334 (2018). [doi: 10.1021/acsnano.8b03569].[48] D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training deep neural networks for the inverse design ofnanophotonics structures,” ACS Photonics , 1365–1369 (2018). [doi: 10.1021/acsphotonics.7b01377].[49] M. R. H. Nezhad, J. Tashkhourian, J. Khodaveisi, and M. R. Khoshi, “Simultaneous colorimetricdetermination of dopamine and ascorbic acid based on the surface plasmon resonance band of colloidalsilver nanoparticles using artiﬁcial neural networks,” Analytical Methods , 1263–1269 (2010). [doi:10.1039/C0AY00302F].[50] J. Ma, Y. Cao, K. Liu, X. Huang, J. Jiang, T. Wang, M. Xue, P. Chang, and T. Liu, ‘ “A simpledemodulation algorithm for optical SPR sensor based on all-phase low-pass ﬁlters,” Proceedings ofSPIE , 106180N (2018). [doi: 10.1117/12.2281236].[51] J. Khodaveisi, S. Dadfarnia, A. M. H. Shabani, M. R. Moghadam, and M. R. H. Nezhad, “Artiﬁcialneural network assisted kinetic spectrophotometric technique for simultaneous determination of parac-etamol and p-aminophenol in pharmaceutical samples using localized surface plasmon resonance bandof silver nanoparticles,”

Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy ,474–480 (2015). [doi: 10.1016/j.saa.2014.11.094].1352] S. Yu, J. Wang, T. Zhang, R. Zhou, J. Dai, Y. Zhou, and K. Xu, “Performance optimization forplasmonic refractive index sensor based on machine learning ,”

Proceedings of SPIE , 110482X(2019). [doi: 10.1117/12.2519699].[53] F. Bahrami, M. Maisonneuve, M. Meunier, J. S. Aitchison, and M. Mojahedi, “An improved refractiveindex sensor based on genetic optimization of plasmon waveguide resonance,”

Optics Express , 20863–20872 (2013). [doi: 10.1364/OE.21.020863].[54] A. Lakhtakia, “Enhancement of optical activity of chiral sculptured thin ﬁlms by suitable inﬁltration ofvoid regions,” Optik , 145–148 (2001). [doi: 10.1078/0030-4026-00024].[55] A. Lakhtakia, “Erratum: Enhancement of optical activity of chiral sculptured thin ﬁlms by suitableinﬁltration of void regions,”

Optik , 544 (2001). [doi: 10.1078/0030-4026-00024].[56] P. D. McAtee, S. T. S. Bukkapatnam, and A. Lakhtakia, “Artiﬁcial neural network to predict therefractive index of a liquid inﬁltrating a chiral sculptured thin ﬁlm,”

Proceedings of SPIE ,107280G (2018). [doi: 10.1117/12.2321355].[57] D. H. Wolpert, “The lack of a priori distinctions between learning algorithms,”

Neural Computation ,1341–1390 (1996). [doi: 10.1162/neco.1996.8.7.1341].[58] D. H. Wolpert and W. G. Macready, “No free lunch theorems for optimization,” IEEE Transactions onEvolutionary Computation , 67–82 (1997). [doi: 10.1109/4235.585893].[59] I. Hodgkinson, Q. h. Wu, and J. Hazel, “Empirical equations for the principal refractive indices andcolumn angle of obliquely deposited ﬁlms of tantalum oxide, titanium oxide, and zirconium oxide,” Applied Optics , 2653–2659 (1998). [doi: 10.1364/AO.37.002653].[60] https://refractiveindex.info/?shelf=main&book=Au&page=Johnson (accessed on July 11 2018).[61] J. Patterson and A. Gibson, Deep Learning , O’Reilly Media, Sebastopol, California (2017).[62] D. Rodrigo, O. Limaj, D. Janner, D. Etezadi, F. J. Garc´ıa de Abajo, V. Pruneri, and H. Altug,“Mid-infrared plasmonic biosensing with graphene,”

Science , 165–168 (2015). [doi: 10.1126/sci-ence.aab2051].[63] K. V. Sreekanth, Q. Ouyang, S. Sreejith, S. Zeng, W. Lishu, E. Ilker, W. Dong, M. ElKabbash, Y.Ting C. T. Lim, M. Hinczewski, G. Strangi, K.-T. Yong, R. E. Simpson, and R. Singh, “Phase-change-material-based low-loss visible-frequency hyperbolic metamaterials for ultrasensitive label-free biosens-ing,”

Advanced Optical Materials7