Efficient Bayesian inversion for shape reconstruction of lithography masks

Abstract

Background: Scatterometry is a fast, indirect and non-destructive optical method for quality control in the production of lithography masks. To solve the inverse problem with the accuracy that upcoming requirements demand, a computationally expensive forward model has to be defined which maps geometry parameters to diffracted light intensities. Aim: To quantify the uncertainties in the reconstruction of the geometry parameters, a fast-to-evaluate surrogate for the forward model has to be introduced. Approach: We use a non-intrusive polynomial chaos based approximation of the forward model, which speeds up evaluation and thus enables the exploration of the posterior through direct Bayesian inference. Additionally, this surrogate allows for a global sensitivity analysis at no additional computational overhead. Results: This approach yields information about the complete distribution of the geometry parameters of a silicon line grating, which in turn makes it possible to quantify the reconstruction uncertainties in the form of means, variances and higher order moments of the parameters. Conclusion: The use of a polynomial chaos surrogate makes it possible to quantify both parameter influences and reconstruction uncertainties. This approach is easy to use since no adaptation of the expensive forward model is required.


Nando Farchmin (a,b), Martin Hammerschmidt (c,d), Philipp-Immanuel Schneider (c,d), Matthias Wurm (a), Bernd Bodermann (a), Markus Bär (a), and Sebastian Heidenreich (a)
(a) Physikalisch-Technische Bundesanstalt, Braunschweig and Berlin
(b) Technische Universität Berlin, Institute of Mathematics
(c) JCMwave GmbH
(d) Zuse Institute Berlin


Keywords: uncertainty quantification, polynomial chaos, inverse problem, parameter reconstruction, scatterometry

ACKNOWLEDGEMENTS

This project has received funding from the German Central Innovation Program (ZIM) No. ZF4014017RR7.

1. INTRODUCTION

Scatterometry is an optical scattering technique frequently used in the semiconductor industry for the characterization of periodic nanostructures on surfaces (determination of critical dimensions). In contrast to other techniques such as electron microscopy, optical microscopy or atomic force microscopy, scatterometry is a non-destructive and indirect method. Over the last decades both feature sizes and the required measurement uncertainty have decreased continuously, hence advanced scatterometry techniques are required. Recently, deep ultraviolet (DUV) scatterometry, extreme ultraviolet (EUV) scatterometry [1, 8], imaging scatterometry and combinations with white light interferometry have been developed. In these approaches, geometrical parameters and associated uncertainties can be determined from diffraction patterns by solving a statistical inverse problem. An overview of the metrology of surfaces in the semiconductor industry can be found in the literature.

We emphasize that scatterometry is an integral measurement method, which means that information about variations within the probe is lost due to averaging over the spot size of the beam. These parameter variations typically lead to a broadening of the diffracted beam. This stochastic effect was not taken into account by the model of the line structure used in this work. Instead, the diffraction efficiencies were calculated from the integral over the whole beam.

The inverse problem of scatterometry is in general ill-posed and regularization techniques have to be applied. The geometry is typically parametrized and the sought-after parameters are obtained by weighted least squares minimization, with weights derived directly from uncertainties in the measurements. However, the quality of these weights depends strongly on the measurements used and itself influences the reconstruction results of the geometry parameters. An alternative approach is to apply a maximum likelihood estimate, which introduces a likelihood function based on an error model and optimizes weighting terms as hyperparameters instead of using predefined values. Based on the same principle, but additionally including prior knowledge, is the maximum posterior approach, which is a state-of-the-art method in parameter reconstruction. In the above frameworks, uncertainties are typically obtained from the Fisher information or covariance matrix, which relies on an assumed shape of the posterior.
However, the shape of the posterior is generally unknown, so this can lead to significant errors in the estimated uncertainties if the actual posterior shape differs from the assumed one.

The Bayesian approach makes it possible to integrate prior knowledge and approximates the probability density function of the geometry parameters independently of any shape assumptions. Uncertainties obtained by Bayesian inference are thus much more robust. On the other hand, it requires a large number of evaluations of the forward model, which is not feasible for expensive computations as in the case of scatterometry. To obtain a surrogate model that reduces the computation time, we employ a polynomial chaos expansion, that is, an expansion into an orthonormal polynomial basis in the parameter space, to approximate the forward model with a global polynomial [17, 18]. We additionally show how this surrogate allows for a Bayesian approach to the inverse problem.

In a recent publication, it has been demonstrated that a surrogate of a forward model for scatterometry based on a polynomial chaos expansion enables Bayesian inversion and the use of Markov chain Monte Carlo (MCMC) sampling. In this approach, cubature rules on sparse grids of Smolyak type are used to determine the expansion coefficients and to construct the surrogate model [19, 20]. However, cubature rules and sparse grids are adapted to the stochastic distributions chosen and are less accurate for correlated stochastic input parameters. In the present work we use optimal linear regression to obtain the coefficients of the polynomial chaos expansion. This novel approach uses optimal sampling points, is much more flexible and is well suited for extensions to adaptive systems.

Figure 1. Cross section of the photomask with description of the stochastic parameters. The dimensional parameter vector is given by $\xi = (h, \mathrm{cd}, \mathrm{swa}, t, r_{\mathrm{top}}, r_{\mathrm{bot}})$. The pitch of the computational domain, i.e. the period, is fixed to 50 nm.

In this paper, we determine the geometry parameters of a photomask that consists of multilayered, periodic, straight absorber lines of two optically different materials. The period of the line structure (pitch) is 50 nm and the geometry parameters of interest are the height of the line $h$, the width at the middle of the line (critical dimension) CD, the sidewall angle SWA, the silicon oxide layer thickness $t$ and the radii of the rounding at the top and bottom corners of the line, $r_{\mathrm{top}}$ and $r_{\mathrm{bot}}$, respectively. A cross section of the geometry for one period of the structure is depicted in Fig. 1. The photomask was illuminated by a light beam of wavelength $\lambda = 266$ nm (DUV) for different angles of incidence $\theta = 3^\circ, \dots$ for perpendicular ($\varphi = 0^\circ$) and parallel ($\varphi = 90^\circ$) orientation of the beam with respect to the grating structure as well as S and P polarization. The refractive indices used are $n_{\mathrm{si}} = 1.967 + 4.\dots i$ for silicon, $n_{\mathrm{ox}} = 1.\dots i$ for the top oxide layer and $n_{\mathrm{air}} = 1.\dots$, with $\mu_r = 1.\dots$

2. FORWARD MODEL

In principle, the propagation of electromagnetic waves is described by Maxwell's equations, but for our simple grating geometry and in the time-harmonic case, Maxwell's equations reduce to a single second order partial differential equation [15, 21],

$$\nabla \times \mu(r)^{-1} \nabla \times E(r) - \omega^2 \varepsilon(r) E(r) = 0. \tag{1}$$

Here, $\varepsilon$ and $\mu$ are the permittivity and permeability, $r$ is the spatial coordinate and $\omega$ is the frequency of the incoming beam. We employ the finite element method (FEM) implemented in the JCMsuite software package to discretize and solve the corresponding scattering problem in weak formulation on a bounded computational unit cell. This formulation yields a splitting of the complete $\mathbb{R}^n$ into an interior domain hosting the total field (incident and scattered) and an exterior domain where only the purely outward radiating scattered field is present. Appropriate boundary conditions are applied at the boundary of the computational domain as depicted in Fig. 1. As the geometry is periodic in the lateral direction, Bloch-periodic boundary conditions are applied. In the vertical direction the geometry is assumed to be unbounded and thus requires a transparent boundary condition at the interface. We use an adaptive perfectly matched layer (PML) method [23, 24] to realize the transparent boundary condition and to satisfy the radiation condition for the scattered field. The employed vectorial FE method uses high-order polynomial ansatz functions defined on the spatial discretization of the computational domain. The triangulation makes it possible to geometrically resolve the material interfaces, and the tangential continuity of the electromagnetic fields across these interfaces is automatically enforced by the method.

The forward model is given by a map of geometry parameters onto the zeroth order intensities of the scattered light for S and P polarization. The parameters $\xi$ used for modelling the grating geometry are depicted in Fig. 1. The forward model is represented by the function $f^* : \Omega \to \mathbb{R}^d$ such that the parameters $\xi \in \Omega \subset \mathbb{R}^M$ are mapped to diffraction efficiencies for a set of azimuthal angles, incidence angles and polarizations. Each of the $d$ components of $f^*(\xi)$ represents a different combination of azimuth, incidence angle and polarization. All other experimental conditions, such as the wavelength, are fixed for this forward model. In our approach, the experimental data $y \in \mathbb{R}^d$ are modelled with the error $y_j = f_j(\xi) + \varepsilon_j$, $j = 1, \dots, d$, where $\varepsilon_j \sim \mathcal{N}(0, \sigma_j^2)$ describes normally distributed noise with zero mean and a standard deviation that depends on an error parameter $b$,

$$\sigma_j(b) = b\, y_j, \quad \text{for } b > 0. \tag{2}$$

Choosing $\sigma_j$ to depend on a stochastic parameter instead of setting it to specific values allows the measurement error to be estimated from the measurement data during the parameter reconstruction and thus incorporates less prior knowledge. The inverse problem is defined as the determination of the geometry parameter values $\xi$ and the error parameter (hyperparameter) $b$ from measured efficiencies $y$.

To obtain a surrogate that is fast to evaluate, the function $f^*$ is expanded into an orthonormal polynomial basis $\{\Phi_\alpha\}_{\alpha \in \Lambda} \subset L^2(\Omega; \varrho)$,

$$f^*(\xi) \approx f(\xi) = \sum_{\alpha \in \Lambda} f_\alpha \Phi_\alpha(\xi) \quad \text{with} \quad f_\alpha = \int_\Omega f^*(\xi)\, \Phi_\alpha(\xi) \,\mathrm{d}\varrho(\xi). \tag{3}$$

The finite set $\Lambda \subset \mathbb{N}^M$ of cardinality $P \in \mathbb{N}$ is a set of multiindices and $\varrho$ denotes the multivariate parameter density for the parameters $\xi$. With this surrogate, evaluating the model at different parameter realizations is equivalent to evaluating polynomials.

We want to emphasize that this approach allows for a global sensitivity analysis of the parameters at almost no additional cost [17, 28–33].
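To make the structure of the surrogate in (3) concrete, the following minimal sketch evaluates a polynomial chaos expansion and reads off its mean and variance directly from the coefficients. It assumes, purely for illustration, that the parameters have been rescaled to $[-1, 1]^M$ with a uniform density, so that tensorized normalized Legendre polynomials form the orthonormal basis, and that $\Lambda$ is a total-degree index set. The function names and the example coefficients are hypothetical and not taken from the paper.

```python
import math
from itertools import product


def legendre_orthonormal(n, x):
    """Phi_n(x) = sqrt(2n+1) * P_n(x), orthonormal w.r.t. the uniform density 1/2 on [-1, 1]."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    # Three-term recurrence: (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return math.sqrt(2 * n + 1) * p


def total_degree_multiindices(M, degree):
    """All multiindices alpha in N^M with |alpha| <= degree (one possible choice of Lambda)."""
    return [a for a in product(range(degree + 1), repeat=M) if sum(a) <= degree]


def basis(alpha, xi):
    """Tensor-product basis function Phi_alpha(xi)."""
    return math.prod(legendre_orthonormal(a, x) for a, x in zip(alpha, xi))


def surrogate(coeffs, xi):
    """Evaluate f(xi) = sum_alpha f_alpha * Phi_alpha(xi); coeffs maps alpha -> f_alpha."""
    return sum(c * basis(alpha, xi) for alpha, c in coeffs.items())


def mean_and_variance(coeffs):
    """By orthonormality: mean = f_0, variance = sum of squares of all other coefficients."""
    zero = (0,) * len(next(iter(coeffs)))
    mean = coeffs.get(zero, 0.0)
    variance = sum(c * c for alpha, c in coeffs.items() if alpha != zero)
    return mean, variance


if __name__ == "__main__":
    # Hypothetical coefficients for M = 2 parameters.
    coeffs = {(0, 0): 2.0, (1, 0): 0.5, (0, 2): -0.25}
    print(surrogate(coeffs, (0.3, -0.7)), mean_and_variance(coeffs))
```

The same grouping of squared coefficients by the parameters they involve gives variance-based sensitivity indices, which is why the sensitivity analysis comes at almost no extra cost once the coefficients are known.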

3. OPTIMAL LINEAR REGRESSION

A simple and non-intrusive approach to compute the expansion coefficients in (3) is linear regression. With the reasonable assumption that, for an arbitrary enumeration of $\Lambda$, the residuum $R(\xi) = f^*(\xi) - \sum_{\ell=1}^{P} f_\ell \Phi_\ell(\xi)$ is a zero mean random variable, we want to find coefficients that minimize the variance of the residuum $R$. In other words, we obtain the least-squares minimization problem: find coefficients $f_\ell$, $\ell = 1, \dots, P$, such that

$$\int_\Omega R(\xi)^2 \,\mathrm{d}\varrho(\xi) = \min. \tag{4}$$

To avoid the high dimensional numerical integration in (4), we approximate the integral in a Monte Carlo sense by

$$\int_\Omega R(\xi)^2 \,\mathrm{d}\varrho(\xi) \approx \frac{1}{N} \sum_{i=1}^{N} R(\xi^{(i)})^2, \tag{5}$$

where $\xi^{(1)}, \dots, \xi^{(N)}$ are $N$ realizations of possible geometry parameter values (see the domain column in Tab. 1). Since (4) is a quadratic minimization problem, the critical point of the first variation yields the wanted minimum. This critical point can be obtained by solving the linear system

$$F = (\Psi^T \Psi)^{-1} \Psi^T F^*, \tag{6}$$

where the matrices $\Psi$ and $F^*$ are given by $\Psi_{i\ell} = \Phi_\ell(\xi^{(i)})$ and $F^*_i = f^*(\xi^{(i)})$. To guarantee that the empirical Gramian $\Psi^T \Psi$ is not ill-conditioned, the number of realizations $N$ has to be sufficiently large. The choice of sampling points is in principle arbitrary, but Cohen and Migliorati [34] showed that sampling from a specific weighted least-squares distribution leads to an optimal (minimal) number of samples for a guaranteed well-conditioned Gramian matrix. Hence, we set

$$\mathrm{d}\mu = w^{-1} \,\mathrm{d}\varrho \quad \text{for} \quad w^{-1}(\xi) = \frac{1}{P} \sum_{\ell=1}^{P} |\Phi_\ell(\xi)|^2 \tag{7}$$

and draw samples $\xi^{(i)} \sim \mu$. Note that $\mu$ is still a probability measure since the polynomials $\{\Phi_\ell\}$ are orthogonal and normalized. With this, the number of samples required to guarantee a low condition number of the Gramian reads $N / \log(N) \geq c P$ for some $c > 0$. Here, we choose $c = 4$, motivated by previously reported empirical results. Applying this optimal sampling strategy allows us to compute a surrogate for the forward model using a minimal number of function evaluations. This surrogate will be employed to reconstruct geometry parameters and quantify the reconstruction uncertainties using Bayesian inference.

4. BAYESIAN APPROACH
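Before turning to the Bayesian formulation, the regression step of Section 3 that produces the surrogate can be sketched in one dimension. The sketch below is an illustration under stated assumptions, not the authors' implementation: the density $\varrho$ is taken to be uniform on $[-1, 1]$ with a normalized Legendre basis of degrees $0, \dots, P-1$, samples from $\mu$ in (7) are drawn by rejection sampling (using the bound $w^{-1} \le P$ that holds for this basis), the logarithm in $N / \log(N) \ge cP$ is assumed to be the natural logarithm, and the normal equations include the weights $w(\xi^{(i)})$ as in the weighted least-squares method of Cohen and Migliorati, whereas (6) is stated without weights. All function names are hypothetical.

```python
import math
import random


def min_samples(P, c=4.0):
    """Smallest N with N / log(N) >= c * P (natural logarithm assumed)."""
    N = 2
    while N / math.log(N) < c * P:
        N += 1
    return N


def phi(n, x):
    """Normalized Legendre polynomial of degree n (orthonormal for the uniform density on [-1, 1])."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return math.sqrt(2 * n + 1) * p


def sample_optimal(P, rng):
    """Draw xi ~ mu with d(mu) = w^{-1} d(rho) by rejection sampling; returns (xi, w(xi))."""
    while True:
        x = rng.uniform(-1.0, 1.0)
        inv_w = sum(phi(l, x) ** 2 for l in range(P)) / P
        if rng.uniform(0.0, P) <= inv_w:  # inv_w <= P on [-1, 1] for this basis
            return x, 1.0 / inv_w


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def fit(f_star, P, N, rng):
    """Weighted least squares: solve (Psi^T W Psi) F = Psi^T W F* for the coefficients F."""
    gram = [[0.0] * P for _ in range(P)]
    rhs = [0.0] * P
    for _ in range(N):
        x, w = sample_optimal(P, rng)
        row = [phi(l, x) for l in range(P)]
        y = f_star(x)  # one (expensive) forward model evaluation per sample
        for a in range(P):
            rhs[a] += w * row[a] * y / N
            for b in range(P):
                gram[a][b] += w * row[a] * row[b] / N
    return solve(gram, rhs)


if __name__ == "__main__":
    rng = random.Random(0)
    P = 3
    coefficients = fit(lambda x: 1 + 2 * x + 3 * x * x, P, min_samples(P), rng)
    print(coefficients)
    print(min_samples(217))  # sample count implied by N / log(N) >= 4 * 217
```

Under these assumptions, `min_samples(217)` evaluates the bound for an expansion with 217 terms and lands just below 8000 forward model evaluations.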

The Bayesian approach provides a statistical method to solve the inverse problem. Following Bayes' theorem, the posterior density is given by

$$\pi(\hat\xi; y) = \frac{L(\hat\xi; y)\, \pi(\hat\xi)}{\int L(\hat\xi; y)\, \pi(\hat\xi) \,\mathrm{d}\hat\xi}, \tag{8}$$

where the prior density $\pi$ describes prior knowledge and the likelihood function $L$ contains the information obtained from the measurement under the assumption of a specific measurement error model. Since the prior density allows expert knowledge to influence the model, it has to be chosen carefully so as not to introduce a bias. We choose a uniform prior on the compact domains of the geometry parameters to impose as little information as possible. The computational and implementation efforts of the Bayesian approach are higher than for the maximum likelihood or least squares methods. However, the posterior yields information about the complete probability density function of the geometry parameters and is thus more reliable for the determination of uncertainties than merely using quantities such as mean and covariance. In addition, combining the results from different measurement modalities within the Bayesian framework ensures a consistent propagation of uncertainties through all measurement contributions [35, 36] (hybrid metrology), in that the posterior for one measurement can be used as the prior for the next.
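The statement that the posterior of one measurement can serve as the prior for the next can be checked numerically on a grid. The following sketch is a toy illustration only: it uses a single scalar parameter with a uniform prior on a hypothetical domain, a made-up linear model and Gaussian likelihoods with a fixed standard deviation. It compares the sequential update with the joint update over both measurements.

```python
import math


def gaussian_likelihood(prediction, y, sigma):
    """Gaussian likelihood of a measurement y given a model prediction."""
    return math.exp(-0.5 * ((prediction - y) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def normalize(weights, dx):
    """Scale grid values so that their Riemann sum integrates to one."""
    total = sum(weights) * dx
    return [w / total for w in weights]


def bayes_update(prior, grid, y, sigma, model, dx):
    """Bayes' theorem on a grid: posterior is proportional to likelihood times prior."""
    unnormalized = [p * gaussian_likelihood(model(x), y, sigma) for p, x in zip(prior, grid)]
    return normalize(unnormalized, dx)


def toy_model(x):
    """Hypothetical forward model, for illustration only."""
    return 0.1 * x


# Uniform prior on a hypothetical parameter domain [22, 28].
n = 601
lo, hi = 22.0, 28.0
dx = (hi - lo) / (n - 1)
grid = [lo + i * dx for i in range(n)]
prior = normalize([1.0] * n, dx)

y1, y2, sigma = 2.55, 2.52, 0.05  # two made-up measurements

# Sequential update: the posterior of the first measurement is the prior of the second.
post_1 = bayes_update(prior, grid, y1, sigma, toy_model, dx)
post_12 = bayes_update(post_1, grid, y2, sigma, toy_model, dx)

# Joint update: prior times the product of both likelihoods.
joint = normalize(
    [p * gaussian_likelihood(toy_model(x), y1, sigma) * gaussian_likelihood(toy_model(x), y2, sigma)
     for p, x in zip(prior, grid)],
    dx,
)

max_difference = max(abs(a - b) for a, b in zip(post_12, joint))
```

Both routes give the same density up to round-off, because the normalization constant of the intermediate posterior cancels in the second update.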

Table 1. Estimates of parameters and uncertainties obtained from the mean value (mean), the double standard deviation (2σ), the relative double standard deviation (11) (rel-2σ), the skewness (skew) and the kurtosis of the posterior distribution. The domain indicates the support of the chosen prior distribution.

parameter     domain         mean    2σ     rel-2σ   skew   kurtosis
h / nm        [43.…, ….0]    48.35   3.11   0.…      −…     …
cd / nm       [22.…, ….0]    25.48   0.59   0.…      …      …
swa / °       [84.…, ….0]    86.87   2.65   0.…      −…     …
t / nm        [4.…, ….0]     4.96    0.35   0.…      …      …
r_top / nm    [8.…, ….0]     10.65   2.30   0.…      −…     …
r_bot / nm    [3.…, ….0]     4.89    1.80   0.…      −…     …
b             [0.…, ….1]     0.01    0.…    …        …      …

The vector $\hat\xi$ consists of the geometry parameters $\xi$ and the noise parameter, i.e. $\hat\xi = (\xi, b)$. Assuming normally distributed zero-mean measurement errors, we choose the likelihood function

$$L(\hat\xi; y) = \prod_{j=1}^{d} \frac{1}{\sqrt{2\pi}\, \sigma_j(b)} \exp\left( -\frac{\left(f^{(j)}(\xi) - y_j\right)^2}{2\, \sigma_j(b)^2} \right), \tag{9}$$

where $f^{(j)}$ is the $j$-th component of the vector-valued function $f$. Note that the form of the measurement error has to be chosen appropriately so as not to introduce a bias. A more general approach in our case would be not to impose a zero mean but to introduce another random hyperparameter. However, the residuum in Fig. 4 suggests that the noise is distributed around zero. In the Bayesian framework, the distributions of parameters are in general determined by Markov chain Monte Carlo (MCMC) sampling, where the forward model has to be evaluated at every sampling step. Normally, this means that equation (1) has to be solved, which makes MCMC sampling impractical due to the large number of required sampling steps. Since the surrogate only requires evaluations of polynomials, the Bayesian approach becomes practical for scatterometry measurement evaluations.

Figure 2. Marginal 1D and 2D densities for the posterior of the stochastic parameters. For the 1D densities, the mean (solid line) and the standard deviation (dashed line) are depicted as well.

For Bayesian inversion, we have to choose a prior distribution for the parameters, calculate the likelihood function and determine the corresponding posterior distribution. The posterior distribution contains the desired parameter values and their associated uncertainties. When two or more measurement results from different measurement sets $y^{(1)}, \dots, y^{(K)}$ are combined, the posterior distribution of the first measurement can be used as the prior distribution for the evaluation of the second measurement, i.e.

$$\pi(\hat\xi; y^{(K)}, y^{(K-1)}, \dots, y^{(1)}) = \frac{\pi(\hat\xi) \prod_{k=1}^{K} L(\hat\xi; y^{(k)})}{\int \pi(\hat\xi) \prod_{k=1}^{K} L(\hat\xi; y^{(k)}) \,\mathrm{d}\hat\xi}. \tag{10}$$

Note that the model function $f$ in the likelihood function is in general different for different measurement setups.
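The sampling described in this section can be sketched as a random walk Metropolis-Hastings loop over $\hat\xi = (\xi, b)$ with the Gaussian likelihood and the error model $\sigma_j(b) = b\, y_j$. The sketch is a minimal illustration under assumptions: the surrogate is a made-up polynomial in one parameter, the prior is uniform on a hypothetical box, the measured values are assumed positive so that $\sigma_j > 0$, and the proposal is an isotropic Gaussian step with hand-picked step sizes. None of these choices are taken from the paper.

```python
import math
import random


def log_likelihood(xi, b, y, surrogate):
    """Logarithm of the Gaussian likelihood with sigma_j(b) = b * y_j (y_j > 0 assumed)."""
    total = 0.0
    for fj, yj in zip(surrogate(xi), y):
        sigma = b * yj
        total += -math.log(math.sqrt(2.0 * math.pi) * sigma) - (fj - yj) ** 2 / (2.0 * sigma ** 2)
    return total


def log_posterior(state, y, surrogate, bounds):
    """Uniform prior on the box `bounds`; state = (xi_1, ..., xi_M, b)."""
    if any(not (lo <= s <= hi) for s, (lo, hi) in zip(state, bounds)):
        return -math.inf
    *xi, b = state
    return log_likelihood(xi, b, y, surrogate)


def metropolis_hastings(log_post, start, step_sizes, n_steps, rng):
    """Random walk Metropolis-Hastings with independent Gaussian proposals per coordinate."""
    chain = [tuple(start)]
    current = log_post(chain[0])
    for _ in range(n_steps):
        proposal = tuple(s + rng.gauss(0.0, h) for s, h in zip(chain[-1], step_sizes))
        candidate = log_post(proposal)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            chain.append(proposal)
            current = candidate
        else:
            chain.append(chain[-1])
    return chain


def toy_surrogate(xi):
    """Hypothetical polynomial surrogate with d = 2 outputs, for illustration only."""
    x = xi[0]
    return [1.0 + x, 1.0 + x * x]


if __name__ == "__main__":
    bounds = [(0.0, 2.0), (0.001, 0.1)]  # hypothetical domains for xi and b
    y = toy_surrogate([1.2])             # synthetic, noise-free data
    chain = metropolis_hastings(
        lambda s: log_posterior(s, y, toy_surrogate, bounds),
        start=(1.0, 0.05), step_sizes=(0.05, 0.01), n_steps=5000, rng=random.Random(1),
    )
    kept = chain[1000:]                  # discard a burn-in phase
    print(sum(s[0] for s in kept) / len(kept))
```

Each step costs one surrogate evaluation, which is what makes long chains affordable compared with solving the finite element problem at every step.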

5. RESULTS

First we want to emphasize the efficiency of our approach. For the scattering problem at hand, it is sufficient to use a chaos expansion with 217 terms to achieve a relative empirical $L^2$-error of less than 1%. Therefore, in the sense of section 3, we generate approximately 10^… samples for the FEM forward model to evaluate. In comparison, computing the function mean, variance and Sobol indices or generating posterior samples empirically would each require more than 10^… function evaluations due to the slow convergence rate of Monte Carlo integration.

We apply Bayesian inversion to the scatterometry measurements to estimate the geometry parameters of the line grating. More details of the measurement setup are described in previous works [5, 15]. A global sensitivity analysis for the geometry parameters indicates that the reconstruction of all parameters is possible, i.e. the forward model is sensitive to all of them. In particular, it should be possible to determine the oxide layer thickness and the critical dimension precisely due to their high sensitivity.

For Bayesian inversion it is necessary to choose prior distributions. In our investigations we have chosen uniform priors on the domains given in Table 1. To obtain the posterior distribution, we sampled with an MCMC random walk Metropolis-Hastings algorithm using the surrogate. We have chosen a sampling size of 10^… samples and a burn-in phase of 10^… samples. For diagnostics, we applied the Gelman-Rubin criterion [38] to ensure that the chains have fully explored the posterior.

Figure 3. Deviation of posterior marginals from a Gaussian distribution with the same mean and variance. The distributions are depicted in their respective reconstruction domains (see Table 1).
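As an illustration of the convergence diagnostic mentioned in the sampling setup, the following sketch computes the Gelman-Rubin potential scale reduction factor for one scalar parameter from several chains of equal length. It uses the basic textbook form of the statistic (between-chain variance compared with within-chain variance); whether the authors used this exact variant is not stated, so it should be read as an assumption. The function name is hypothetical.

```python
import math


def gelman_rubin(chains):
    """Potential scale reduction factor for m chains of equal length n (one scalar parameter)."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(chain) / n for chain in chains]
    grand_mean = sum(means) / m
    # Between-chain variance B and mean within-chain variance W.
    between = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    within = sum(
        sum((x - mu) ** 2 for x in chain) / (n - 1) for chain, mu in zip(chains, means)
    ) / m
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


if __name__ == "__main__":
    mixed = [[0.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0]]
    separated = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
    print(gelman_rubin(mixed), gelman_rubin(separated))
```

Values close to one indicate that the chains agree with each other, while values well above one indicate that at least one chain has not yet explored the same region of the posterior as the others.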

In Fig. 2 the posterior (marginal) densities for all 6 stochastic parameters are shown. All posterior densities are characterized by sharp peaks with mean and standard deviation similar to the previous publication. The mean and double standard deviation for each parameter, including the hyperparameter (error parameter) $b$, are shown in Table 1. Since the domain sizes of the parameters vary due to their geometrical meaning, we introduce the relative double standard deviation (rel-2σ). The rel-2σ of a stochastic variable $\eta$ is the double standard deviation divided by half the width of the parameter domain:

$$\text{rel-}2\sigma = \frac{4\sigma}{\beta - \alpha} \quad \text{where } \eta \in [\alpha, \beta]. \tag{11}$$

The rel-2σ shows how the posterior distribution is spread within the domain. For example, if the domain for the critical dimension is [22, 28] nm and the 2σ is 0.…, the rel-2σ is 0.… . The rel-2σ in Table 1 shows that the smallest reconstruction uncertainties are obtained for the critical dimension with a rel-2σ of about 20%, followed by the oxide layer thickness with 35% rel-2σ. The height has a rel-2σ of about 62%. The posterior densities of the sidewall angle and the corner roundings are slightly wider, at about 90% rel-2σ. This is in line with the global sensitivity analysis. The results for the error parameter $b$ in Table 1 show that the relative measurement uncertainty is approximately 1%.

One major advantage of Bayesian inference is that it provides information about the complete posterior distribution instead of just the parameter values obtained from the global minimizer. Looking at the marginals in Fig. 2, it is easy to verify that the posterior is not Gaussian. The densities of the rounding radii are not symmetric, the marginal distribution of the sidewall angle exhibits a plateau around the mean and the height even suggests multi-modality. These observations are confirmed by the skewness (third moment) and kurtosis (fourth moment) of the posterior, which differ (except for the oxide layer thickness) quite significantly from the skewness and kurtosis of a Gaussian, see Table 1. The deviation of the marginals from a Gaussian with the same mean and standard deviation is shown in Fig. 3 for all parameters.

In our case the marginal distributions of the posterior are similar enough to a Gaussian distribution that the 2σ confidence interval contains roughly 95% of the mass, as displayed in Table 2. However, in general it is more reasonable to directly compute intervals of mass concentration (confidence intervals) rather than relying on the standard deviation to characterize the uncertainties of a distribution, because the latter can be misleading for non-Gaussian distribution shapes, which occur for example in [20, 39].
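The difference between a 2σ interval and an interval of mass concentration, as well as the rel-2σ of (11), can be illustrated on posterior samples. The sketch below is a minimal example with hypothetical function names. The central 95% interval is taken between the 2.5% and 97.5% sample quantiles, which is one common convention and an assumption here, and the skewed test data are synthetic rather than results from the paper.

```python
import math


def mean_std(samples):
    """Sample mean and standard deviation (with Bessel's correction)."""
    n = len(samples)
    mean = sum(samples) / n
    return mean, math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))


def quantile(sorted_samples, q):
    """Empirical quantile with linear interpolation between order statistics."""
    pos = q * (len(sorted_samples) - 1)
    i = int(math.floor(pos))
    j = min(i + 1, len(sorted_samples) - 1)
    return sorted_samples[i] + (pos - i) * (sorted_samples[j] - sorted_samples[i])


def two_sigma_interval(samples):
    """Interval mean +/- 2 sigma."""
    mean, std = mean_std(samples)
    return mean - 2.0 * std, mean + 2.0 * std


def central_mass_interval(samples, mass=0.95):
    """Central interval containing the given fraction of the samples."""
    s = sorted(samples)
    tail = (1.0 - mass) / 2.0
    return quantile(s, tail), quantile(s, 1.0 - tail)


def rel_two_sigma(two_sigma, alpha, beta):
    """Eq. (11): double standard deviation divided by half the domain width."""
    return 2.0 * two_sigma / (beta - alpha)


if __name__ == "__main__":
    # Critical dimension: 2 sigma = 0.59 nm from Table 1, domain [22, 28] nm from the text.
    print(rel_two_sigma(0.59, 22.0, 28.0))
    # Synthetic, strongly skewed samples (quantiles of an exponential distribution).
    n = 1000
    skewed = [-math.log(1.0 - (i + 0.5) / n) for i in range(n)]
    print(two_sigma_interval(skewed), central_mass_interval(skewed))
```

For the critical dimension this reproduces the rel-2σ of about 20% quoted in the text. For the skewed samples the 2σ interval extends below zero, where there is no probability mass at all, while the central mass interval stays inside the support.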

Table 2. Double standard deviation and 95% mass confidence intervals of all geometry parameters and the error hyperparameter.

parameter     2σ interval      95% confidence interval
h / nm        (45.…, ….47)     (45.…, …)
cd / nm       (24.…, ….07)     (24.…, …)
swa / °       (84.…, ….52)     (84.…, …)
t / nm        (4.…, ….31)      (4.…, …)
r_top / nm    (8.…, ….96)      (8.…, …)
r_bot / nm    (3.…, ….69)      (3.…, …)
b             (0.…, …)         (…, …)

The error parameter $b$ describes the mismatch between the surrogate of the forward model $f$ and the measurements. In a previous work, a maximum posterior approach (MPA) incorporating the same measurement data was used to determine the geometry parameters. The MPA searches for the global maximum of the posterior. Uncertainties were determined locally with an approximation of the covariance matrix around the maximum posterior point. The difference here is that we calculated the whole posterior distribution. This has the advantage that the scheme gives reliable uncertainty estimates even for multi-peaked and non-Gaussian posterior distributions. The results obtained there are consistent with our findings since the posterior is relatively close to the assumed Gaussian shape. There are only slight differences. For example, the marginal distribution for the height $h$ is broad (non-Gaussian), yielding larger uncertainties. Similarly, the mean values for $r_{\mathrm{top}}$ and $r_{\mathrm{bot}}$ are slightly shifted due to the asymmetry of the marginal posterior (non-Gaussian). The deviation between the forward model values and the measurement data of 2% is comparable with that found previously.

Figure 4. Scattered intensities for the two polarizations and different azimuthal angles. Compared are the measurements of the scatterometry experiment and the simulation of the PC surrogate for the mean values of the parameter reconstruction. The bottom graph shows the pointwise deviation.

6. CONCLUSION

In this paper we applied a polynomial chaos expansion as a surrogate for the forward model in scatterometry. Since the surrogate only requires the evaluation of polynomials instead of solving Maxwell's equations, it was feasible to use a full Bayesian approach to determine the posterior distribution for all geometry parameters. To generate samples from the posterior distribution, we employed an MCMC Metropolis random walk sampling method and checked the overall independence of the obtained samples with the Gelman-Rubin criterion. The reconstruction results obtained with the surrogate model are consistent with those obtained by a maximum posterior estimate with a Gauss-Newton-like method and are in line with the predictions from a global sensitivity analysis. We conclude that a Bayesian approach based on the polynomial chaos surrogate gives accurate and reliable estimates of silicon line grating parameters and uncertainties.

REFERENCES

[1] Hsu, S. and Terry, F., “Spectroscopic ellipsometry and reflectometry from gratings (scatterometry) for critical dimension measurement and in situ, real-time process monitoring,” Thin Solid Films, 828–836 (2004).
[2] Mack, C., [Fundamental principles of optical lithography: the science of microfabrication], John Wiley & Sons (Nov 2008).
[3] Scholze, F., Soltwisch, V., Dai, G., Henn, M.-A., and Gross, H., “Comparison of CD measurements of an EUV photomask by EUV scatterometry and CD-AFM,” in [Photomask Technology 2013], 88800O, International Society for Optics and Photonics (2013).
[4] Henn, M.-A., Gross, H., Heidenreich, S., Scholze, F., Elster, C., and Bär, M., “Improved reconstruction of critical dimensions in extreme ultraviolet scatterometry by modeling systematic errors,” Measurement Science and Technology (4), 044003 (2014).
[5] Wurm, M., Bonifer, S., Bodermann, B., and Richter, J., “Deep ultraviolet scatterometer for dimensional characterization of nanostructures: system improvements and test measurements,” Measurement Science and Technology (9), 094024 (2011).
[6] Agocs, E., Bodermann, B., Burger, S., Dai, G., Endres, J., Hansen, P.-E., Nielson, L., Madsen, M. H., Heidenreich, S., Krumrey, M., et al., “Scatterometry reference standards to improve tool matching and traceability in lithographical nanomanufacturing,” in [Nanoengineering: Fabrication, Properties, Optics, and Devices XII], 955610, International Society for Optics and Photonics (2015).
[7] Wurm, M., Endres, J., Probst, J., Schoengen, M., Diener, A., and Bodermann, B., “Metrology of nanoscale grating structures by UV scatterometry,” Optics Express (3), 2460–2468 (2017).
[8] Raymond, C. J., Murnane, M. R., Sohail, S., Naqvi, H., and McNeil, J. R., “Metrology of subwavelength photoresist gratings using optical scatterometry,” Journal of Vacuum Science & Technology B: Microelectronics and Nanometer Structures Processing, Measurement, and Phenomena (4), 1484–1495 (1995).
[9] Madsen, M. H. and Hansen, P.-E., “Imaging scatterometry for flexible measurements of patterned areas,” Optics Express (2), 1109–1117 (2016).
[10] Paz, V. F., “Solving the inverse grating problem by white light interference Fourier scatterometry,” Light: Science & Applications (11), e36 (2012).
[11] Germer, T. A., Patrick, H. J., Silver, R. M., and Bunday, B., “Developing an uncertainty analysis for optical scatterometry,” in [Metrology, Inspection, and Process Control for Microlithography XXIII], 72720T, International Society for Optics and Photonics (2009).
[12] Orji, N. G., Badaroglu, M., Barnes, B. M., Beitia, C., Bunday, B. D., Celano, U., Kline, R. J., Neisser, M., Obeng, Y., and Vladar, A., “Metrology for the next generation of semiconductor devices,” Nature Electronics (10), 532 (2018).
[13] El Gawhary, O., Kumar, N., Pereira, S., Coene, W., and Urbach, H., “Performance analysis of coherent optical scatterometry,” Applied Physics B (4), 775–781 (2011).
[14] Henn, M.-A., Gross, H., Scholze, F., Wurm, M., Elster, C., and Bär, M., “A maximum likelihood approach to the inverse problem of scatterometry,” Opt. Express (12), 12771–12786 (2012).
[15] Hammerschmidt, M., Weiser, M., Santiago, X. G., Zschiedrich, L., Bodermann, B., and Burger, S., “Quantifying parameter uncertainties in optical scatterometry using Bayesian inversion,” in [Modeling Aspects in Optical Metrology VI], Bodermann, B., Frenner, K., and Silver, R. M., eds., 8–17, International Society for Optics and Photonics, SPIE (2017).
[16] Heidenreich, S., Gross, H., Wurm, M., Bodermann, B., and Bär, M., “The statistical inverse problem of scatterometry: Bayesian inference and the effect of different priors,” in [Modeling Aspects in Optical Metrology V], 95260U, International Society for Optics and Photonics (2015).
[17] Sudret, B., “Global sensitivity analysis using polynomial chaos expansions,” Reliability Engineering and System Safety (7), 964–979 (2008).
[18] Xiu, D., “Fast numerical methods for stochastic computations: A review,” Commun. Comput. Phys, 242–272 (2009).
[19] Heidenreich, S., Gross, H., Henn, M., Elster, C., and Bär, M., “A surrogate model enables a Bayesian approach to the inverse problem of scatterometry,” in [Journal of Physics: Conference Series], (1), 012007, IOP Publishing (2014).
[20] Heidenreich, S., Gross, H., and Bär, M., “Bayesian approach to determine critical dimensions from scatterometric measurements,” Metrologia (6), S201 (2018).
[21] Monk, P., [Finite element methods for Maxwell’s equations], Numerical Mathematics and Scientific Computation, Oxford University Press, New York (2003).
[22] Pomplun, J., Burger, S., Zschiedrich, L., and Schmidt, F., “Adaptive Finite Element Method for Simulation of Optical Nano Structures,” Physica Status Solidi (B), 3419–3434 (Oct 2007).
[23] Berenger, J.-P., “A perfectly matched layer for the absorption of electromagnetic waves,” Journal of Computational Physics (2), 185–200 (1994).
[24] Zschiedrich, L., Transparent boundary conditions for Maxwell’s equations: numerical concepts beyond the PML method, PhD thesis, Freie Universität Berlin (2009).
[25] Wiener, N., “The Homogeneous Chaos,” Amer. J. Math. (4), 897–936 (1938).
[26] Cameron, R. H. and Martin, W. T., “The orthogonal development of non-linear functionals in series of Fourier-Hermite functionals,” Ann. of Math. (2), 385–392 (1947).
[27] Ghanem, R. and Spanos, P.-T., “Polynomial chaos in stochastic finite elements,” Journal of Applied Mechanics, Transactions of the ASME, 197–202 (Mar 1990).
[28] Sobol, I. M., “Sensitivity estimates for nonlinear mathematical models,” Math. Modeling Comput. Experiment (4), 407–414 (1995) (1993).
[29] Sobol, I. M., “Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates,” Math. Comput. Simulation (1-3), 271–280 (2001).
[30] Homma, T. and Saltelli, A., “Importance measures in global sensitivity analysis of nonlinear models,” Reliability Engineering and System Safety (1), 1–17 (1996).
[31] Saltelli, A. and Annoni, P., “How to avoid a perfunctory sensitivity analysis,” Environmental Modelling and Software (12), 1508–1517 (2010).
[32] Saltelli, A., Annoni, P., Azzini, I., Campolongo, F., Ratto, M., and Tarantola, S., “Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index,” Computer Physics Communications (2), 259–270 (2010).
[33] Farchmin, N., Hammerschmidt, M., Schneider, P.-I., Wurm, M., Bodermann, B., Bär, M., and Heidenreich, S., “Efficient global sensitivity analysis for silicon line gratings using polynomial chaos,” in [Modeling Aspects in Optical Metrology VII], Bodermann, B., Frenner, K., and Silver, R. M., eds., Proc. SPIE, 115–121, International Society for Optics and Photonics, SPIE (2019).
[34] Cohen, A. and Migliorati, G., “Optimal weighted least-squares methods,” SMAI Journal of Computational Mathematics, 181–203 (2017).
[35] Silver, R. M., Zhang, N. F., Barnes, B. M., Qin, J., Zhou, H., and Dixson, R., “A Bayesian statistical model for hybrid metrology to improve measurement accuracy,” in [Modeling Aspects in Optical Metrology III], 808307, International Society for Optics and Photonics (2011).
[36] Silver, R. M., Barnes, B. M., Zhang, N. F., Zhou, H., Vladar, A., Villarrubia, J., Kline, J., Sunday, D., and Vaid, A., “Optimizing hybrid metrology through a consistent multi-tool parameter set and uncertainty model,” in [Metrology, Inspection, and Process Control for Microlithography XXVIII], 905004, International Society for Optics and Photonics (2014).
[37] Heidenreich, S., Gross, H., and Bär, M., “Bayesian approach to the statistical inverse problem of scatterometry: Comparison of three surrogate models,” International Journal for Uncertainty Quantification (6) (2015).
[38] Gelman, A. and Rubin, D. B., “Inference from iterative simulation using multiple sequences,” Statistical Science (4), 457–472 (1992).
[39] Fernández Herrero, A., Pflüger, M., Probst, J., Scholze, F., and Soltwisch, V., “Applicability of the Debye-Waller damping factor for the determination of the line-edge roughness of lamellar gratings,” Optics Express 27