An efficient approach to global sensitivity analysis and parameter estimation for line gratings
Nando Farchmin, Martin Hammerschmidt, Philipp-Immanuel Schneider, Matthias Wurm, Bernd Bodermann, Markus Bär, Sebastian Heidenreich
Nando Farchmin (a,b), Martin Hammerschmidt (c,d), Philipp-Immanuel Schneider (c,d), Matthias Wurm (a), Bernd Bodermann (a), Markus Bär (a), and Sebastian Heidenreich (a)
(a) Physikalisch-Technische Bundesanstalt, Braunschweig and Berlin
(b) Technische Universität Berlin, Institute of Mathematics
(c) JCMwave GmbH
(d) Zuse Institute Berlin
ABSTRACT
Scatterometry is a fast, indirect and nondestructive optical method for quality control in the production of lithography masks. Geometry parameters of line gratings are obtained from diffracted light intensities by solving an inverse problem. To comply with the upcoming need for improved accuracy and precision, and thus for the reduction of uncertainties, computationally expensive forward models have typically been used. In this paper we use Bayesian inversion to estimate parameters from scatterometry measurements of a silicon line grating and determine the associated uncertainties. Since the direct application of Bayesian inference using Markov chain Monte Carlo methods to physics-based partial differential equation (PDE) models is not feasible due to high computational costs, we use an approximation of the PDE forward model based on a polynomial chaos expansion. The expansion provides not only a surrogate for the PDE forward model, but also Sobol indices for a global sensitivity analysis. Finally, we compare our results for the global sensitivity analysis with the uncertainties of the estimated parameters.
Keywords: uncertainty quantification, polynomial chaos, global sensitivity analysis, inverse problem, parameter reconstruction, scatterometry
1. INTRODUCTION
Scatterometry is an optical scattering technique frequently used for the characterization of periodic nanostructures on surfaces in the semiconductor industry. In contrast to other techniques like electron microscopy, optical microscopy or atomic force microscopy, scatterometry is a non-destructive and indirect method. In particular, geometry parameters of interest are determined by measuring diffraction patterns and solving an inverse problem.

A basic requirement for the success of the parameter estimation is that the underlying model is sensitive to the parameters of interest. The more sensitively a system depends on a certain parameter, the easier the reconstruction of that parameter. On the other hand, when designing numerical models to simulate experiments, it is often unclear which parameters are necessary for the model, which often results in a large number of model parameters even if some of them have little or no influence on the system. For both reasons, a sensitivity analysis is often useful. A sensitivity analysis gives a priori information about the influence of input parameters on the output.

Most often, variance-based sensitivity indices are computed by using Monte Carlo methods, which is computationally demanding [2, 3]. Hence, often only the sensitivities of a few, presumably most important parameters are considered. Here we present an expansion into polynomials that yields an algebraic approach to characterize the global sensitivities for all parameters at lower computational cost.

In this paper, we determine the geometry parameters of a photomask that consists of multilayered, periodic, straight absorber lines of two optically different materials. The period of the line structure (pitch) is 50 nm and the geometry parameters of interest are the height of the line $h$, the width at the middle of the line (critical dimension) CD, the sidewall angle SWA, the silicon oxide layer thickness $t$ and the radii of the rounding at the top and bottom corners of the line, $r_{\mathrm{top}}$ and $r_{\mathrm{bot}}$, respectively. A cross section of the geometry for one period of the structure is depicted in Fig. 1. The photomask was illuminated by a light beam of wavelength $\lambda = 266$ nm for different angles of incidence $\theta = 3^\circ, \ldots$ for perpendicular ($\varphi = 0^\circ$) and parallel ($\varphi = 90^\circ$) orientation of the beam with respect to the grating structure as well as S and P polarization.

Figure 1. Cross section of the photomask with description of the stochastic parameters. The dimensional parameter vector is given by $\xi = (h, \mathrm{cd}, \mathrm{swa}, t, r_{\mathrm{top}}, r_{\mathrm{bot}})$. The pitch, i.e. the length of periodicity, is fixed to 50 nm.

In the next sections, we proceed as follows. First, we introduce the forward model of the problem, followed by the global sensitivity analysis and Bayesian inversion based on a polynomial chaos expansion. Second, we carry out the global sensitivity analysis and estimate the posterior distribution from measurement data. Finally, we compare both results.
2. FORWARD MODEL
In principle, the propagation of electromagnetic waves is described by Maxwell's equations, but for our simple grating geometry Maxwell's equations reduce to a single second-order partial differential equation,

$$\nabla \times \mu^{-1} \nabla \times E + \omega^2 \varepsilon E = 0. \tag{1}$$

Here, $\varepsilon$ and $\mu$ are the permittivity and permeability and $\omega$ is the frequency of the incoming beam. The boundary conditions are chosen to be transparent in horizontal and periodic in lateral direction. The resulting boundary value problem is solved by a finite element method (FEM). For computations we used the JCMsuite* software package.

* https://jcmwave.com/jcmsuite

The forward model is given by a map of geometry parameters onto S and P polarization of first order intensities of the scattered light. The parameters $\xi$ used for modelling the grating geometry are depicted in Fig. 1. The forward model is represented by the function $f^* \colon \Omega \to \mathbb{R}^d$ such that the parameters $\xi \in \Omega \subset \mathbb{R}^M$ are mapped to diffraction efficiencies for a set of azimuthal angles, incidence angles and polarizations.

To obtain a surrogate that is fast to evaluate, the function $f^*$ is expanded into an orthonormal polynomial basis $\{\Phi_\alpha\}_{\alpha \in \Lambda} \subset L^2(\Omega; \varrho)$,

$$f^*(\xi) \approx f(\xi) = \sum_{\alpha \in \Lambda} f_\alpha \Phi_\alpha(\xi) \quad \text{with} \quad f_\alpha = \int_\Omega f(\xi)\, \Phi_\alpha(\xi)\, \mathrm{d}\varrho(\xi). \tag{2}$$

The finite set $\Lambda \subset \mathbb{N}^M$ is a set of multiindices and $\varrho$ denotes the multivariate parameter density for the parameters $\xi$. With this surrogate, evaluating the model for different parameter realizations reduces to the evaluation of polynomials. Additionally, we remark that this is a nonintrusive method and hence the solver used for the deterministic Helmholtz equation (1) needs no adaptation.

In our approach, the experimental data $y \in \mathbb{R}^d$ are modelled with the error $y_j = f_j(\xi) + \epsilon_j$, $j = 1, \ldots, d$, where $\epsilon_j \sim \mathcal{N}(0, \sigma_j^2)$ describes normally distributed noise with zero mean and a standard deviation that depends on the error parameter $b$,

$$\sigma_j(b) = b\, y_j, \quad \text{for } b > 0. \tag{3}$$

The inverse problem in scatterometry consists of determining the geometry parameter values $\xi$ and the error parameter (hyperparameter) $b$ from the measured efficiencies $y$.
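The expansion (2) can be illustrated with a small numerical sketch. It rests on assumptions that are not taken from the text: a uniform parameter density on $[-1, 1]^M$ (for which orthonormal Legendre polynomials are a suitable basis), a total-degree index set $\Lambda$, and tensor Gauss-Legendre quadrature for the coefficient integrals, with the model itself projected onto the basis. The function `toy_model` is a hypothetical cheap stand-in for the FEM solver, and all function names are illustrative only.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre


def orthonormal_legendre(n, x):
    """Legendre polynomial of degree n, orthonormal w.r.t. the uniform density on [-1, 1]."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return np.sqrt(2 * n + 1) * legendre.legval(x, coeffs)


def basis(alpha, xi):
    """Tensor-product basis function Phi_alpha at points xi of shape (N, M)."""
    out = np.ones(xi.shape[0])
    for m, degree in enumerate(alpha):
        out *= orthonormal_legendre(degree, xi[:, m])
    return out


def fit_pce(model, n_params, max_degree):
    """Coefficients f_alpha by tensor Gauss-Legendre quadrature (assumed scheme)."""
    nodes_1d, weights_1d = legendre.leggauss(max_degree + 1)
    weights_1d = weights_1d / 2.0  # the uniform density on [-1, 1] is 1/2
    nodes = np.array(list(itertools.product(nodes_1d, repeat=n_params)))
    weights = np.prod(list(itertools.product(weights_1d, repeat=n_params)), axis=1)
    values = model(nodes)
    # total-degree multi-index set Lambda (an assumption of this sketch)
    index_set = [a for a in itertools.product(range(max_degree + 1), repeat=n_params)
                 if sum(a) <= max_degree]
    return {a: float(np.sum(weights * values * basis(a, nodes))) for a in index_set}


def evaluate_pce(coeffs, xi):
    """Evaluate the surrogate f(xi) = sum_alpha f_alpha Phi_alpha(xi)."""
    return sum(c * basis(a, xi) for a, c in coeffs.items())


def variance_shares(coeffs):
    """Share of the surrogate variance per parameter subset (support of alpha)."""
    variance = sum(c ** 2 for a, c in coeffs.items() if any(a))
    shares = {}
    for a, c in coeffs.items():
        support = tuple(m for m, degree in enumerate(a) if degree > 0)
        if support:
            shares[support] = shares.get(support, 0.0) + c ** 2 / variance
    return shares


def toy_model(xi):
    """Hypothetical stand-in for the expensive forward model (scalar output)."""
    return 1.0 + 2.0 * xi[:, 0] + 0.5 * xi[:, 1] ** 2 + xi[:, 0] * xi[:, 1]


coeffs = fit_pce(toy_model, n_params=2, max_degree=2)
shares = variance_shares(coeffs)
print("mean of surrogate:", coeffs[(0, 0)])
print("variance shares:", shares)
```

Because the basis is orthonormal, the coefficient of the constant polynomial is the mean of the surrogate and the sum of the remaining squared coefficients is its variance. Grouping the squared coefficients by the parameters they involve gives variance shares of the kind used in polynomial chaos based sensitivity analysis (cf. [4]). The real forward model is vector valued and lives on physical parameter domains, so in practice the parameters would be rescaled and the fit repeated per output component.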
3. GLOBAL SENSITIVITY ANALYSIS
Sensitivity analysis is a widely used tool to identify the influence of uncertain input parameters on the output of a physical system or model. Local methods for sensitivity analysis use partial derivatives of the output of the system with respect to the various uncertain input parameters to obtain the local parameter dependence of the system. However, since local sensitivity analysis does not cover the whole input space, only the effect of small perturbations can be observed. Global variance-based sensitivity analysis, on the other hand, decomposes the total system variance $\operatorname{Var}[f^*]$ over the complete parameter space into parts attributable to input parameters and combinations thereof [10, 11].
Among the vast collection of variance-based methods for sensitivity analysis, Sobol indices are a common and widespread way to characterize parameter sensitivities. The map $f^*$ from above with expectation $\mathbb{E}[f^*]$ and variance $\operatorname{Var}[f^*]$ can be decomposed as

$$f^*(\xi) = S_0 + \sum_{1 \le s \le M} \; \sum_{j_1 < \cdots} \cdots$$

4. BAYESIAN APPROACH

The Bayesian approach provides a statistical method to solve the inverse problem. Following Bayes' theorem, the posterior density is given by

$$\pi(\hat\xi; y) = \frac{L(\hat\xi; y)\, \pi(\hat\xi)}{\int L(\hat\xi; y)\, \pi(\hat\xi)\, \mathrm{d}\hat\xi}, \tag{7}$$

where the prior density $\pi$ describes prior knowledge and the likelihood function $L$ contains the information obtained from the measurement. The vector $\hat\xi$ consists of the geometry parameters $\xi$ and the noise parameter, i.e. $\hat\xi = (\xi, b)$. Assuming normally distributed measurement errors, we choose the likelihood function

$$L(\hat\xi; y) = \prod_{j=1}^{d} \frac{1}{\sqrt{2\pi}\, \sigma_j(b)} \exp\left( -\frac{(f_j(\xi) - y_j)^2}{2\, \sigma_j(b)^2} \right). \tag{8}$$

In the Bayesian framework, the distributions of parameters are in general determined by Markov chain Monte Carlo (MCMC) sampling, where the forward model has to be evaluated in every sampling step. Normally, this means that the Helmholtz equation has to be solved, which makes MCMC sampling impractical due to the large number of required sampling steps. Since the surrogate only requires evaluations of polynomials, the Bayesian approach becomes practical for scatterometry measurement evaluations.

Table 1. Estimates of the parameters and their uncertainties obtained from the mean value (mean), the standard deviation (std) and the relative standard deviation (rel. std) of the posterior distribution. The domain indicates the support of the chosen prior distribution.

parameter   domain             mean        std        rel. std
h           [43.…, ….0] nm     48.35 nm    1.56 nm    0.…
CD          […, ….0] nm        25.48 nm    0.30 nm    0.…
SWA         […, …]°            …°          …°         …
t           [4.…, ….0] nm      4.96 nm     0.18 nm    0.…
r_top       [8.…, ….0] nm      10.65 nm    1.15 nm    0.…
r_bot       [3.…, ….0] nm      4.89 nm     0.90 nm    0.…
b           [0.…, ….1]         0.01        0.…        …

If several measurements $y^{(1)}, \ldots, y^{(K)}$ are combined, the posterior distribution of the first measurement can be used as the prior distribution for the evaluation of the second measurement, and so on, i.e.

$$\pi(\hat\xi; y^{(K)}, y^{(K-1)}, \ldots, y^{(1)}) = \frac{\pi(\hat\xi) \prod_{k=1}^{K} L(\hat\xi; y^{(k)})}{\int \pi(\hat\xi) \prod_{k=1}^{K} L(\hat\xi; y^{(k)})\, \mathrm{d}\hat\xi}. \tag{9}$$

Note that the model function $f$ in the likelihood function is in general different for different measurement setups.

5. RESULTS

For the geometry presented above, we calculated the Sobol indices, Eq. (5), for all parameters as depicted in Fig. 2. Boxplots for perpendicular and parallel beam incidence as well as S and P polarization are shown, each taken over 43 angles of incidence $\theta$. The sensitivity to parameter correlations is very small and is therefore omitted here. First of all, we note that the height, critical dimension and oxide layer thickness make up most of the total variance. The sensitivity also depends on the polarization and the angle of incidence. For example, the model is most sensitive to the oxide layer thickness $t$ for S polarization with perpendicular beam orientation ($\varphi = 0^\circ$), and the sensitivity to the sidewall angle depends strongly on the angle of incidence (P polarization, parallel to the beam).

With this, we expect that the reconstruction of all parameters is feasible. In particular, it should be possible to determine the oxide layer thickness and the critical dimension precisely due to their high sensitivity.

Figure 2. Boxplots of Sobol indices for all geometry parameters, azimuthal angles and polarizations. Each boxplot includes Sobol indices for the whole range of angles of incidence.

Next, we apply Bayesian inversion to scatterometry measurements to estimate the geometry parameters of the line grating. More details of the measurement setup are described in [5, 14]. For Bayesian inversion it is necessary to choose prior distributions.
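The posterior (7) with the likelihood (8) and the noise model (3) can be explored with a random-walk Metropolis sampler, as sketched below. This is a minimal illustration under stated assumptions rather than the sampler used for the results: `surrogate` is a hypothetical stand-in that is linear in two parameters, the prior is uniform on a box, the data are synthetic, and the step sizes, sample size and burn-in length are arbitrary choices.

```python
import numpy as np

# hypothetical measurement settings (stand-in for angles and polarizations)
ANGLES = np.linspace(0.0, 1.0, 20, endpoint=False)


def surrogate(xi):
    """Hypothetical stand-in for the polynomial surrogate f: 2 parameters -> 20 outputs."""
    return 3.0 + xi[0] * np.cos(2 * np.pi * ANGLES) + xi[1] * np.sin(2 * np.pi * ANGLES)


def log_posterior(theta, y, lower, upper):
    """Unnormalised log of Eq. (7): uniform prior on a box, Gaussian likelihood Eq. (8)."""
    if np.any(theta < lower) or np.any(theta > upper):
        return -np.inf  # outside the support of the uniform prior
    xi, b = theta[:-1], theta[-1]
    sigma = b * y  # noise model sigma_j(b) = b * y_j, Eq. (3); y_j > 0 here
    residual = surrogate(xi) - y
    return -np.sum(np.log(sigma)) - 0.5 * np.sum((residual / sigma) ** 2)


def metropolis(y, lower, upper, n_samples, step, rng):
    """Random-walk Metropolis with independent Gaussian proposals per coordinate."""
    theta = 0.5 * (lower + upper)  # start in the middle of the prior box
    logp = log_posterior(theta, y, lower, upper)
    chain = np.empty((n_samples, theta.size))
    for i in range(n_samples):
        proposal = theta + step * rng.standard_normal(theta.size)
        logp_new = log_posterior(proposal, y, lower, upper)
        # accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < logp_new - logp:
            theta, logp = proposal, logp_new
        chain[i] = theta
    return chain


# synthetic data from known parameters (illustration only)
rng = np.random.default_rng(1)
xi_true, b_true = np.array([0.6, -0.5]), 0.01
clean = surrogate(xi_true)
y = clean * (1.0 + b_true * rng.standard_normal(clean.size))

lower = np.array([-1.0, -1.0, 1e-3])  # prior box for (xi_1, xi_2, b)
upper = np.array([1.0, 1.0, 0.2])
step = np.array([0.01, 0.01, 0.002])

chain = metropolis(y, lower, upper, n_samples=20000, step=step, rng=rng)
posterior = chain[5000:]  # discard the burn-in phase
print("posterior mean:", posterior.mean(axis=0))
print("posterior std: ", posterior.std(axis=0))
```

Each iteration needs one evaluation of the forward model, which is why replacing the PDE solver by a polynomial surrogate makes sampling of this kind affordable. The normalising integral in (7) never has to be computed, since only ratios of posterior values enter the acceptance step.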
In our investigations we have chosen uniform priors on the domains given in Table 1. To obtain the posterior distribution, we sampled with an MCMC random-walk Metropolis algorithm using the surrogate. We have chosen a sample size of $10^{\dots}$ samples and a burn-in phase of $10^{\dots}$ samples. For diagnostics, we applied the Gelman-Rubin criterion [15] to ensure that the generated samples are independent. We calculated the marginal posterior densities for all 6 stochastic parameters. All posterior densities are characterized by sharp peaks with mean and standard deviation similar to the previous publication. The mean and standard deviation for each parameter, including the hyperparameter (error parameter) $b$, are shown in Table 1. Since the domain sizes of the parameters vary due to their geometrical meaning, we introduce the relative standard deviation (rel. std). The relative std is the std divided by half the width of the parameter domain:

$$\operatorname{rel.std}(\eta) = \frac{2\, \operatorname{std}(\eta)}{\beta - \alpha} \quad \text{where } \eta \in [\alpha, \beta]. \tag{10}$$

The relative std shows how the posterior distribution is spread within the domain. For example, for the critical dimension with the domain [22, 28] nm and the std of 0.30 nm from Table 1, Eq. (10) gives rel. std = 2 · 0.30 / (28 − 22) = 0.10. The values of $b$ given in Table 1 show that the relative measurement uncertainty is approximately 1%.

Finally, Fig. 3 displays a comparison between the measurement data and the evaluation of our surrogate model using the reconstructed geometry parameters. The pointwise relative deviation of the approximation from the measurement data is 2% and lower. In the previous publication, the measurement data were evaluated by a Maximum Posterior Approach (MPA). The MPA searches for the global maximum of the posterior, and uncertainties are determined from the local covariance matrix. The difference here is that we calculated the whole posterior distribution. This has the advantage that the scheme gives reliable uncertainty estimates even for multi-peaked and non-Gaussian posterior distributions. The results obtained there are consistent with our findings. There are only slight differences.
For example, the marginal distribution for the height $h$ is broad (non-Gaussian), yielding larger uncertainties. Similarly, the mean values for $r_{\mathrm{top}}$ and $r_{\mathrm{bot}}$ are slightly shifted due to the asymmetry of the marginal posterior (non-Gaussian). The deviation of 2% between the forward model values and the measurement data is comparable with that found in the previous publication.

Figure 3. Scattered intensities for the two polarizations and different azimuthal angles. Compared are the measurements of the scatterometry experiment and the simulation of the PC surrogate for the mean values of the parameter reconstruction. The bottom graph shows the pointwise deviation.

6. SUMMARY

In this paper we applied a polynomial chaos expansion as a surrogate for the forward model in scatterometry. This approach enables us to perform a global sensitivity analysis at low computational cost. Moreover, since the surrogate only requires the evaluation of polynomials instead of solving the Helmholtz equation, it was feasible to use a full Bayesian approach to determine the posterior distribution for all geometry parameters. To generate samples from the posterior distribution, we employed an MCMC random-walk Metropolis sampling method and checked the overall independence of the samples with the Gelman-Rubin criterion. The reconstruction results obtained with the surrogate model are consistent with those obtained by a Maximum Posterior estimate with a Gauss-Newton-like method and are in line with the predictions from our global sensitivity analysis. We finally conclude that a Bayesian approach based on the polynomial chaos surrogate gives accurate and reliable estimates for silicon line grating parameters and their uncertainties.

REFERENCES

[1] Hsu, S. and Terry, F., "Spectroscopic ellipsometry and reflectometry from gratings (scatterometry) for critical dimension measurement and in situ, real-time process monitoring," Thin Solid Films, 828–836 (2004).
[2] Homma, T. and Saltelli, A., "Importance measures in global sensitivity analysis of nonlinear models," Reliability Engineering and System Safety (1), 1–17 (1996).
[3] Saltelli, A., Annoni, P., Azzini, I., Campolongo, F., Ratto, M., and Tarantola, S., "Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index," Computer Physics Communications (2), 259–270 (2010).
[4] Sudret, B., "Global sensitivity analysis using polynomial chaos expansions," Reliability Engineering and System Safety (7), 964–979 (2008).
[5] Hammerschmidt, M., Weiser, M., Santiago, G. X., Zschiedrich, L., Bodermann, B., and Burger, S., "Quantifying parameter uncertainties in optical scatterometry using Bayesian inversion," Proc. SPIE 10330, Modeling Aspects in Optical Metrology (2017).
[6] Wiener, N., "The Homogeneous Chaos," Amer. J. Math. (4), 897–936 (1938).
[7] Cameron, R. H. and Martin, W. T., "The orthogonal development of non-linear functionals in series of Fourier-Hermite functionals," Ann. of Math. (2), 385–392 (1947).
[8] Ghanem, R. and Spanos, P.-T., "Polynomial chaos in stochastic finite elements," Journal of Applied Mechanics, 197–202 (1990).
[9] Saltelli, A. and Annoni, P., "How to avoid a perfunctory sensitivity analysis," Environmental Modelling and Software (12), 1508–1517 (2010).
[10] Sobol, I. M., "Sensitivity estimates for nonlinear mathematical models," Math. Modeling Comput. Experiment (4), 407–414 (1993).
[11] Sobol, I. M., "Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates," Math. Comput. Simulation (1-3), 271–280 (2001).
[12] Heidenreich, S., Gross, H., and Bär, M., "Bayesian approach to the statistical inverse problem of scatterometry: Comparison of three surrogate models," International Journal for Uncertainty Quantification (6) (2015).
[13] Heidenreich, S., Gross, H., and Bär, M., "Bayesian approach to determine critical dimensions from scatterometric measurements," Metrologia (6), S201 (2018).
[14] Wurm, M., Bonifer, S., Bodermann, B., and Richter, J., "Deep ultraviolet scatterometer for dimensional characterization of nanostructures: system improvements and test measurements," Measurement Science and Technology (9), 094024 (2011).
[15] Gelman, A. and Rubin, D. B., "Inference from iterative simulation using multiple sequences," Statistical Science (4), 457–472 (1992).
[16] Agocs, E., Bodermann, B., Burger, S., Dai, G., Endres, J., Hansen, P.-E., Nielson, L., Madsen, M. H., Heidenreich, S., Krumrey, M., et al., "Scatterometry reference standards to improve tool matching and traceability in lithographical nanomanufacturing," in [Nanoengineering: Fabrication, Properties, Optics, and Devices XII], 955610, International Society for Optics and Photonics (2015).
[17] Heidenreich, S., Gross, H., Henn, M., Elster, C., and Bär, M., "A surrogate model enables a Bayesian approach to the inverse problem of scatterometry," in [Journal of Physics: Conference Series], (1), 012007, IOP Publishing (2014).
[18] Heidenreich, S., Gross, H., Wurm, M., Bodermann, B., and Bär, M., "The statistical inverse problem of scatterometry: Bayesian inference and the effect of different priors," in [Modeling Aspects in Optical Metrology V], 9526