High-order numerical method for the nonlinear Helmholtz equation with material discontinuities in one space dimension
aa r X i v : . [ m a t h - ph ] S e p High-order numerical method for the nonlinearHelmholtz equation with material discontinuities inone space dimension
G. Baruch a G. Fibich a S. Tsynkov b , a Department of Applied Mathematics, School of Mathematical Sciences, Tel AvivUniversity, Ramat Aviv, Tel Aviv 69978, Israel b Department of Mathematics, North Carolina State University, Box 8205, Raleigh, NC27695, USA
Abstract
The nonlinear Helmholtz equation (NLH) models the propagation of electromagneticwaves in Kerr media, and describes a range of important phenomena in nonlinear optics andin other areas. In our previous work, we developed a fourth order method for its numericalsolution that involved an iterative solver based on freezing the nonlinearity. The methodenabled a direct simulation of nonlinear self-focusing in the nonparaxial regime, and aquantitative prediction of backscattering. However, our simulations showed that there is athreshold value for the magnitude of the nonlinearity, above which the iterations diverge.In this study, we numerically solve the one-dimensional NLH using a Newton-type non-linear solver. Because the Kerr nonlinearity contains absolute values of the field, the NLHhas to be recast as a system of two real equations in order to apply Newton’s method. Ournumerical simulations show that Newton’s method converges rapidly and, in contradistinc-tion with the iterations based on freezing the nonlinearity, enables computations for veryhigh levels of nonlinearity.In addition, we introduce a novel compact finite-volume fourth order discretization forthe NLH with material discontinuities. Our computations corroborate the design fourthorder convergence of the method.The one-dimensional results of the current paper create a foundation for the analysis ofmulti-dimensional problems in the future.
Key words:
Kerr nonlinearity, nonlinear optics, inhomogeneous medium, discontinuouscoefficients, finite volume discretization, compact scheme, high order method, artificialboundary conditions (ABCs), two-way ABCs, traveling waves, complex valued solutions,Frechét differentiability, Newton’s method.Preprint submitted toElsevier October 27, 2018
Introduction
The nonlinear Helmholtz equation (NLH) ∆ E ( x ) + ω c n E = 0 , n ( x , | E | ) = n ( x ) + 2 n ( x ) n ( x ) | E | , (1)governs the propagation of linearly-polarized, time-harmonic electromagneticwaves in Kerr-type dielectrics. Here, x = [ x , . . . , x D ] are the spatial coordinates, E = E ( x ) denotes the scalar electric field, ω is the laser frequency, c is the speedof light in vacuum, ∆ = ∂ x + . . . + ∂ x D is the D -dimensional Laplacian, n is thelinear index of refraction, and n is the Kerr coefficient. In this study, we considerthe case of an inhomogeneous medium in which both n and n can vary in space.We assume that the medium is lossless, i.e., that n and n are real. Furthermore,we consider only the case in which the electric field E and the material coefficients n and n vary in one spatial direction that we identify with the direction of prop-agation and denote by z . Hence, equation (1) reduces to the one-dimensional cubicNLH: d E ( z ) dz + ω c (cid:16) n ( z ) + 2 n ( z ) n ( z ) | E | (cid:17) E = 0 . (2) ✛ ✲✲ ✂✄ ✁(cid:0) ✲ Grated Kerr medium Z z = ik z Incoming fieldReflected field
R e − ik z E inc e ik z Transmittedfield z = Z max Figure 1. A grated Fabry-Perot etalon.
The ordinary differential equation (2)arises, for example, when model-ing nonlinear optical devices, suchas the Fabry-Perot etalon [1], seeFigure 1. This device consists of alayer or slab of Kerr medium lo-cated between ≤ z ≤ Z max . TheKerr slab is surrounded by a lin-ear homogeneous medium, so that n ≡ n ext and n ≡ for z < andfor z > Z max . We consider the casewhen an incoming plane wave E = E inc e ik z impinges normally on the slab at theinterface z = 0 . Here, k = ω c n ext is the linear wavenumber in the surrounding Email addresses: [email protected] (G. Baruch), [email protected] (G.Fibich), [email protected] (S. Tsynkov).
URLs: ∼ guybar (G. Baruch), ∼ fibich (G. Fibich), ∼ stsynkov (S. Tsynkov). Corresponding author. Phone: +1-919-515-1877, Facsimile: +1-919-513-7336. The re-search of this author was supported by the US NSF, Grant ν ( z ) = ( n ( z ) /n ext ) , ǫ ( z ) = 2 n ( z ) n ( z ) / ( n ext ) . Then, equation (2) transforms into d E ( z ) dz + k (cid:16) ν ( z ) + ǫ ( z ) | E | (cid:17) E = 0 , (3)where ν ≡ and ǫ ≡ for z < and for z > Z max .We assume that the Kerr material is either homogeneous, i.e., ν ( z ) ≡ ν int , ǫ ( z ) ≡ ǫ int , ≤ z ≤ Z max , (4)or layered (piecewise-constant). The latter case corresponds to a one-dimensionalgrating (see Figure 1), where for some given partition: z < · · · < ˜ z l < · · · < ˜ z L = Z max , (5a)we have: ν ( z ) ≡ ˜ ν l , ǫ ( z ) ≡ ˜ ǫ l , for z ∈ (˜ z l , ˜ z l +1 ) . (5b)At the interfaces ˜ z l , the boundary conditions for Maxwell’s equations imply con-tinuity of the field E ( z ) and its first derivative dEdz (see Appendix A). Note that,the material coefficients ν ( z ) and ǫ ( z ) are, generally speaking, discontinuous at theKerr medium boundaries z = 0 and z = Z max .When equation (3) is considered on the interval ≤ z ≤ Z max , it needs to besupplemented by boundary conditions at z = 0 and z = Z max . Outside of thisinterval, the field propagates linearly with ν ≡ and ǫ ≡ . Therefore, for z ≤ ,the total field is composed of a given incoming wave and the unknown reflectedwave E ( z ) = E inc e ik z + Re − ik z . (6a)For z ≥ Z max , the electric field is given by the transmitted wave E ( z ) = T e ik z . (6b)The transmitted and reflected waves shall be interpreted as outgoing with respect tothe domain of interest [0 , Z max ] . Note that the left-traveling wave Re − ik z containsthe field reflected from the interface z = 0 (i.e., the reflection per se), as well as thefield generated by nonlinear backscattering inside the interval [0 , Z max ] .The transmitted field (6b) satisfies a Sommerfeld-type homogeneous differentialrelation at z = Z max + : ddz − ik ! E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z = Z max + = ddz − ik ! T e ik z (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z = Z max + = 0 . E and dEdz at z = Z max yields the following boundary condi-tion: ddz − ik ! E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z = Z max = 0 . (7a)Similarly, at z = 0 − we can write, see (6a): ddz + ik ! E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z =0 − = ddz + ik ! (cid:16) E inc e ik z + Re − ik z (cid:17)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z =0 − = 2 ik E inc . Hence, the continuity of E and dEdz at z = 0 leads to the boundary condition: ddz + ik ! E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z =0 = 2 ik E inc . (7b)The boundary conditions (7a) and (7b) enable the propagation of outgoing wavesfrom inside the interval [0 , Z max ] toward its exterior. In addition, the boundary con-dition (7b) prescribes the given incoming wave E inc e ik z at the left boundary z = 0 ,and is therefore referred to as the two-way boundary condition.The problem (3), (7) can be rescaled as follows: ˜ E = E/E inc , ˜ ǫ = ǫ | E inc | . Hence, we can assume hereafter with no loss of generality that E inc = 1 . (8)Under this rescaling, a variation in ǫ represents a variation in the input beampower | E inc | .Closed form solutions for equation (3) in a homogeneous medium (4) were first ob-tained by Wilhelm [2] for a real-valued field, and by Marburger and Felber [3] fora complex-valued field. These solutions were later used by Chen and Mills [4] tosolve equation (3) with the boundary conditions (7), as follows. Since the NLH (3)is a second order ODE, the boundary condition (7a) at z = Z max , together with achoice of the transmitted field amplitude T , constitute an initial value problem at z = Z max that has a unique solution E = E ( z ; T, ǫ ) . For an arbitrary value of T , the solution E ( z ; T ) does not, generally speaking, satisfy the boundary condi-tion (7b) at z = 0 . One can therefore use a shooting approach to find the value(s) of T = T ( ǫ ) for which the solutions of the initial value problem also satisfy (7b) [andhence the full problem (3), (7), (8)]. When the nonlinearity ǫ is small, the function T = T ( ǫ ) is single-valued, see Figure 2(A). When the nonlinearity exceeds a cer-tain threshold ǫ > ǫ c , the function T = T ( ǫ ) becomes multi-valued, which impliesnonuniqueness of the solution. The nonuniqueness occurs at certain intervals of ǫ Note that as | E ( Z max ) | = | T | , a choice of T is equivalent to a choice of E at z = Z max . ε |T| first switchback ε (A)~~ second switchback ε ~~ ε |T| c ε ε ’c (B) Figure 2. (A) The transmittance | T | as a function of ǫ for the solution of the one-dimen-sional NLH (3) with ν ≡ , k = 8 and Z max = 10 . (B) Zoom-in on the first region ofswitchback-type nonuniqueness for . ≈ ǫ c ≤ ǫ ≤ ǫ ′ c ≈ . . and is of a switchback type, see Figure 2(B). In the physics literature, this behavioris often referred to as bistability.In a subsequent paper [5], Chen and Mills extended their approach to the case ofpiecewise-constant material coefficients (5), which corresponds to the formulationthat we analyze numerically in this paper, see Section 1.2. Knapp, Papanicolaou andWhite [6] considered the case of a large homogeneous slab and a weak nonlinearity.They showed that the threshold for nonuniqueness ǫ c scales as Z − . They alsotreated random media, which we do not consider here.In addition to analytical studies, equation (3) was also studied numerically usinga shooting approach [7–10] which is conceptually similar to the one of Chen andMills [4, 5]. Unlike [4, 5], however, in these studies, for each value of T at Z max theCauchy problem is solved numerically, rather than analytically. The advantage ofthis approach over [4,5] is that it can be applied to media with a smooth variation ofmaterial properties [7–10] and to lossy materials [7], as opposed to only piecewise-constant media in [4, 5]. The main shortcoming of the shooting approach, however,is that it cannot be generalized to multidimensional problems.The NLH can also be solved numerically as a full boundary value problem. In ourprevious work [11–13], we solved the multidimensional NLH (1) for the homoge-neous Kerr medium with ν ≡ and ǫ ≡ const . To do that, we developed andimplemented nonlocal two-way boundary conditions similar to (7); they provided akey element of the numerical methodology. In [14, 15], Suryanto et al. used a finiteelement scheme for solving the one-dimensional NLH (3) subject to the two-wayboundary conditions. The finite element approximation constructed in [14, 15] al-lowed for material discontinuities at the grid nodes. This approximation was of amixed order; the linear terms of (3) were approximated with fourth order accuracy,while the nonlinearity was approximated with second order accuracy. Note that the ν ≡ corresponds to the case for which the linear index of refraction n ( x ) is the same both inside and outside the Kerr medium. ν and/or ǫ are dis-continuous, the second derivative of the solution E ( z ) is discontinuous. The pres-ence of discontinuities in the solution must be properly accounted for when build-ing a numerical approximation of equation (3). In particular, a naive high-orderapproximation may lose its accuracy as the grid is refined. In this context, we notethat the coefficient ǫ is always discontinuous at least at z = 0 and z = Z max . Such adiscontinuity cannot be addressed by a scheme that assumes smoothness across theboundary, such as the standard (five-point) fourth order central-difference schemeused in our previous work [11–13]. Indeed, we have observed in [13] a deteriorationof the fourth order accuracy at fine grid resolutions.In the current paper, we present a novel fourth order numerical scheme for theNLH (3) based on a compact approximation of finite volume type . The use of inte-gration over the grid cells allows us to correctly account for the discontinuities in ν ( z ) and ǫ ( z ) both at the outer boundaries and inside the Kerr medium. The fourthorder accuracy is attained on a compact three node stencil by using the differen-tial equation (3) to eliminate the leading terms of the truncation error. A similarequation-based approach was used by Singer and Turkel in [16] to obtain a com-pact high order approximation for the linear Helmholtz equation. As we shall see,however, construction of a compact approximation for finite volumes, and espe-cially in the nonlinear case is considerably more complex. In particular, we needto use Birkhoff-Hermite interpolation to approximate the field between the gridnodes with fourth order accuracy. To the best of our knowledge, this is the first timeever that a genuine fourth order scheme is built for the NLH with discontinuouscoefficients.While we analyze the formal accuracy of our schemes, a theoretical error estimateis beyond the scope of this paper, because the problem is nonlinear. Instead, weevaluate the numerical error experimentally, and demonstrate that the schemes pos-sess the anticipated rate of convergence. Moreover, in Appendix B we provide aconvergence proof for a linear problem with a material discontinuity, in which thematerial coefficient ν is in the form of a step function. In this case, we can obtainclosed form solutions for both the continuous equation and its discrete counterpart,and use them to establish the error estimates. Note that this simple setup capturesthe key features of our treatment of material discontinuities by finite volumes, andillustrates that the scheme indeed has the design rate of grid convergence.The second key improvement offered by the current paper is in the methodologyused to solve the nonlinear equations on the grid. Previously [11–13], we solvedthe NLH by simple iterations based on freezing the nonlinearity; a similar approachwas also employed by Suryanto et al. in [14,15]. While this approach has allowed usto obtain a number of interesting solutions to the NLH with a weak nonlinearity, forsomewhat stronger nonlinearities the iterations would cease to converge [11–15].In order to overcome this limitation, in this paper we solve the NLH (3) usingNewton’s iterations. Applying Newton’s method to the NLH is not straightforward6hough, since the nonlinearity in (3) is nondifferentiable in the sense of Frechét.We recall that the solutions of the NLH (3) must be complex valued, otherwise it isimpossible to adequately describe traveling waves in the time-harmonic context. Hence, to obtain a proper Newton’s linearization we recast the complex equation (3)as a system of two real equations. In the literature, Newton’s method has been ap-plied to similar problems. For example, in the work of Gómez-Gardeñes, et al. [17],the authors solve the steady-state nonlinear Schrödinger equation on a lattice byNewton’s method (see also [18–21]). Our particular implementation of Newton’smethod for the NLH leads to a block tridiagonal structure of the Jacobians, whichenables an efficient inversion. We also note that the application of Newton’s methodto a higher order discretization of the the NLH with material discontinuities bringsalong additional complications (Section 3).Our computations show that the use of Newton’s iterations leads to a very consider-able improvement in performance over the previous "frozen-nonlinearity" iterativemethods [11–15], as it enables robust numerical solution of the NLH for strongnonlinearities. In fact, solutions can be computed for nonlinearities far above thethreshold of nonuniqueness, and even for the nonlinearities that lead to materialbreakdown in an actual physical setting. Note that in the latter case, the Kerr modelitself becomes inapplicable.The paper is organized as follows: In Section 1.2 we present a summary of the math-ematical formulation. In Section 2, we describe our discrete approximation. Webegin with the finite volume formulation (Section 2.1), then introduce two secondorder approximations (Section 2.2 and Section 2.3) and the fourth order approxi-mation (Section 2.4), and finally construct the boundary conditions in the discretesetting (Section 2.5). In Section 3, we build a Newton’s solver for the Frechét non-differentiable NLH. To clarify the presentation, we first illustrate the approach fora single variable (Section 3.1), then generalize to multivariable nondifferentiablefunctions (Section 3.2), apply the method to the three discrete approximations ofthe NLH (Section 3.3), and finally discuss the choice of the initial guess (Sec-tion 3.4). A summary of the numerical method is given in Section 4. Numericalcomputations are performed in Section 5, examining the convergence of the iter-ations and the computational error of the methods (Section 5.2 and Section 5.3,respectively). We conclude with a discussion in Section 6.
In the current paper, we will be solving the one-dimensional NLH [cf. (3)]: d E ( z ) dz + k (cid:16) ν ( z ) + ǫ ( z ) | E | (cid:17) E = 0 , < z < Z max , (9a) This is reflected by the fact that the boundary conditions (7) are complex. ddz + ik ! E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z =0 = 2 ik , ddz − ik ! E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z = Z max = 0 . (9b)In formulae (9a), (9b), we assume the scaling E inc = 1 , see (8). The medium on theinterval [0 , Z max ] can have piecewise-constant material coefficients: ν ( z ) ≡ ˜ ν l , ǫ ( z ) ≡ ˜ ǫ l , for z ∈ (˜ z l , ˜ z l +1 ) . (9c)For simplicity only, we assume a uniform partition into L − homogeneous slabsof equal width ∆ z = Z max L − : ˜ z l = ( l − z, l = 1 , . . . , L. (9d)The homogeneous case (4) corresponds to the case L = 2 . At the interfaces ˜ z l , thesolution E ( z ) and its first derivative dEdz are continuous, but the second derivative d Edz is discontinuous. Away from the interfaces, i.e., inside every interval (9c), thematerial coefficients ν ( z ) and ǫ ( z ) are constant, and the NLH (9a) implies that thefield E ( z ) is infinitely differentiable. In this section, we present our discretization of problem (9). First, we introduce anintegral formulation of the NLH (9a) (see Section 2.1) and discretize it on the grid(Section 2.2, 2.3, and 2.4). Then, we implement the boundary conditions (9b) in afully discrete framework (Section 2.5).
Let a, b ∈ [0 , Z max ] , a < b , and let us integrate equation (9a) between the points a and b with respect to z . Since dEdz is continuous everywhere, we obtain: dE ( b ) dz − dE ( a ) dz + k Z ba (cid:16) ν ( z ) + ǫ ( z ) | E | (cid:17) E dz = 0 . (10)Equation (10) can be interpreted as the integral conservation law that correspondsto the NLH (9a). It is easy to see that for sufficiently smooth solutions the twoformulations are equivalent. Indeed, if we require that the integral relation (10)hold for any pair of points a and b , then at every point z where d Edz exists theNLH (9a) can be reconstructed from the conservation law (10) by a straightforwardpassage to the limit: a → z − , b → z + 0 . However, the integral formulation (10)8akes sense even when the differential equation per se loses its validity because ofinsufficient regularity of the solution, i.e., when the material coefficients undergojump discontinuities and the second derivative d Edz becomes discontinuous.Let us introduce a uniform grid of M nodes on the interval ≤ z ≤ Z max : z m = ( m − h, where h = Z max M − , m = 1 , . . . , M. (11a)We choose h so that ∆ z of (9d) is an integer multiple of h . This choice guaranteesthat material discontinuities will only be located at the grid nodes, i.e., that both ν ( z ) and ǫ ( z ) will be constant within each grid cell: ν ( z ) ≡ ν m , ǫ ( z ) ≡ ǫ m , z ∈ ( z m , z m +1 ) . (11b)To approximate the NLH on the grid (11a), we apply the integral relation (10) be-tween the midpoints of every two neighboring cells, i.e., for [ a, b ] = [ z m − , z m + ] , m = 1 , , . . . , M . Then, using formula (11b), we arrive at dEdz (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z m + 12 z m − + k ν m − Z z m z m − E dz + k ǫ m − Z z m z m − | E | E dz + k ν m Z z m + 12 z m E dz + k ǫ m Z z m + 12 z m | E | E dz = 0 . (12)Equation (12) relates integrals of the unknown continuous function E ( z ) with itsderivatives at z = z m ± . We will approximate the individual terms in (12) usingthe nodal values E ( z m ) ≡ E m , m = 1 , . . . , M , of the field. The resulting schemewill be equivalent to compact finite differences on the regions of smoothness of thesolution, where it could also be obtained without using the integral formulation (seeSection 6.1 for further discussion of an approach alternative to the use of integralformulation). Otherwise, i.e., near the discontinuities, the scheme will approximatethe integral relation (10), and hence (12), rather than the differential equation (9a).Recall that the material coefficients ν ( z ) and ǫ ( z ) are constant in between thegrid nodes and consequently, E ( z ) is infinitely differentiable within each grid cell.Hence, all the integrands in (12) can be approximated with fourth order accuracyusing cubic polynomials. Together with a fourth order approximation of the deriva-tives, this yields a fourth order compact scheme for the NLH (9a), see Section 2.4.An even simpler piecewise linear approximation of E ( z ) yields a second ordercompact scheme, and we will describe its two different versions, in Sections 2.2and 2.3. In addition to providing a reference point for comparison, the second orderschemes allow us to introduce the general framework and notations exploited laterfor building the more complex fourth order method.9 .2 Second order approximation We approximate the first term on the left-hand side of (12) using central differences: dEdz (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z m + 12 z m − = E m +1 − E m h − E m − E m − h + O (cid:16) h (cid:17) . (13)Without assuming any additional regularity of E ( z ) beyond the continuity of itsfirst derivative, we merely have the difference of two fluxes approximated withsecond order accuracy. If, however, the material coefficients are continuous at z m ,i.e., if ν m = ν m − and ǫ m = ǫ m − , then d Edz and higher derivatives exist and arecontinuous as well. In this case, if we divide the undivided second difference onthe right-hand side of (13) by h , then a straightforward Taylor-based argument willyield a second order central-difference approximation of d Edz : E m +1 − E m + E m − h = d Edz (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z m + O (cid:16) h (cid:17) . (14)To approximate the third integral on the left-hand side of (12), we linearly interpo-late E ( z ) on the interval [ z m , z m + ] : E ( z ) ≡ E ( z m + hζ ) = (1 − ζ ) E m + ζ E m +1 + O (cid:16) h (cid:17) , ζ ∈ (cid:20) , (cid:21) . (15)Then, substituting expression (15) into the third integral of (12), we have: Z z m + 12 z m E dz = h Z / [(1 − ζ ) E m + ζ E m +1 ] dζ + O ( h )= 3 h E m + h E m +1 + O ( h ) . (16)Likewise, we can linearly interpolate the cubic term | E | E on [ z m , z m + ] to obtain: Z z m + 12 z m | E | E dz = 3 h | E m | E m + h | E m +1 | E m +1 + O ( h ) . The expressions for the subinterval [ z m − , z m ] are derived similarly, we merelyreplace ν m , ǫ m , and E m +1 with ν m − , ǫ m − , and E m − , respectively. Finally, byassembling all the terms we arrive at the following second order approximation of The flux difference on the right-hand side of (13) is exactly the same as we would haveobtained if we approximated the second derivative d Edz by the standard piecewise linearGalerkin finite elements, see, e.g., [22]; having a continuous first derivative of E ( z ) issufficient for building this approximation. m = 1 , , . . . , M : hF m ( E ) def = E m +1 − E m h − E m − E m − h + hk ν m − E m − + 3 E m hk ν m E m + E m +1 hk ǫ m − | E m − | E m − + 3 | E m | E m hk ǫ m | E m | E m + | E m +1 | E m +1 . (17a)The vector E = [ E , . . . , E M ] T was used as an argument of F m ( E ) in for-mula (17a), because F m operates on E m − , E m , and E m +1 . Hence, for the interfacenodes m = 1 and m = M , the system of equations (17a) requires the addition ofthe ghost nodes m = 0 and m = M + 1 , respectively. The value of the field at theghost nodes will be determined by the boundary conditions, see Section 2.5. Notealso that the notation E m needs to be interpreted differently in different expressions.Namely, in (13), (15), (16) and similar formulae that introduce approximation ofthe individual terms in (12), E m denotes the value of the exact continuous solutionof (9) on the grid (11a). In formula (17a), however, E m denotes the approximatediscrete solution, which we calculate numerically.If the material coefficients ν and ǫ are continuous at z m , i.e., if ν m − = ν m and ǫ m − = ǫ m , then d Edz exists at this point along with higher order derivatives. In thatcase, equation (17a) reduces to F m ( E ) = E m +1 − E m + E m − h + k ν m E m − + 6 E m + E m +1 k ǫ m | E m − | E m − + 6 | E m | E m + | E m +1 | E m +1 . (17b)In scheme (17b), the second derivative d Edz is approximated by the conventionalsecond order central differences (14), but the non-differentiated terms are evaluatedas weighted sums over three neighboring nodes rather than pointwise. Instead of interpolating the cubic term | E | E as in Section 2.2, one can substitutethe linear interpolation (15) into the corresponding integrals of (12). This approachis slightly more cumbersome. As we will see in Section 2.4, however, it will enablethe construction of the fourth order compact discretization.11t is convenient to adopt a tensor notation. First, we recast formula (15) as E ( z m + hζ ) = X i =0 F i ( ζ ) E m + i + O (cid:16) h (cid:17) , where F = 1 − ζ , F = ζ . This representation, when substituted into the linear integral term of (12), providesan equivalent alternative form of equation (16): Z z m + 12 z m E dz = h X i =0 Z F i dζ !| {z } f i E m + i + O (cid:16) h (cid:17) = h X i =0 f i E m + i + O (cid:16) h (cid:17) , while its substitution into the cubic term | E | E = E ∗ E yields: Z z m + 12 z m | E | E dz = h Z / X i =0 F i ( ζ ) E ∗ m + i ! X j =0 F j ( ζ ) E m + j X k =0 F k ( ζ ) E m + k ! dζ + O (cid:16) h (cid:17) = h X i,j,k =0 Z F i F j F k dζ !| {z } g ijk E ∗ m + i E m + j E m + k + O (cid:16) h (cid:17) = h X i,j,k =0 g ijk E ∗ m + i E m + j E m + k + O (cid:16) h (cid:17) . The constants f i and g ijk in the previous formulae are defined as f i = Z F i dζ , g ijk = Z F i F j F k dζ , i, j, k = 0 , . Evaluation of these integrals yields: f = 38 , f = 18 ,g = 1564 , g = 11192 , g = 5192 , g = 164 . Note that the tensor elements g ijk are symmetric with respect to any permutation ofthe indices i , j , and k , e.g., g = g = g .Altogether, the integrals over [ z m , z m + ] in (12) are approximated as Z z m + 12 z m (cid:16) ν m E + ǫ m | E | E (cid:17) dz = hν m X i =0 f i E m + i + hǫ m X i,j,k =0 g ijk E ∗ m + i E m + j E m + k + O ( h ) , [ z m − , z m ] are approximated the same way. Hence, the al-ternative second order discretization of the integral relation (12) can be written as hF m ( E ) def = E m +1 − E m h − E m − E m − h + hk ν m − X i =0 f i E m − i + hk ǫ m − X i,j,k =0 g ijk E ∗ m − i E m − j E m − k + hk ν m X i =0 f i E m + i + hk ǫ m X i,j,k =0 g ijk E ∗ m + i E m + j E m + k = 0 , (18a)where m = 1 , . . . , M . Similarly to (17a), E m in formula (18a) should be interpretedas the approximate solution on the grid (11a), and its values at the ghost nodes m = 0 and m = M + 1 are determined in Section 2.5. Again, if ν and ǫ arecontinuous and E is smooth at z m , then scheme (18a) reduces to a central-differencesecond order scheme for the NLH (9a): F m ( E ) = E m +1 − E m + E m − h + k ν m X i =0 f i ( E m − i + E m + i )+ k ǫ m X i,j,k =0 g ijk (cid:16) E ∗ m − i E m − j E m − k + E ∗ m + i E m + j E m + k (cid:17) = 0 . (18b)Note that the linear terms in (18a) and (18b) are identical to those in (17a)and (17b), respectively, they are merely expressed in a different form. In this section, we build a compact fourth order discretization for the integral rela-tion (12). The general idea of all compact schemes is to use the original differentialequation to obtain the higher order derivatives that could help cancel the leadingterms of the truncation error and thus improve the order of accuracy. This idea hasbeen implemented, e.g., by Singer and Turkel in [16] for a finite-difference ap-proximation of the linear Helmholtz equation. Hereafter, we adopt some elementsof their equation-based approach. As we shall see though, some additional com-plications arise when this approach is applied to the approximation of the integralrelation (12), which, in particular, involves nonlinearity.The differential equation (9a) inside the grid cells can be used to evaluate the one-13ided second derivatives at the grid nodes as follows: E ′′ m + def = d Edz (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z = z m + = − k (cid:16) ν m + ǫ m | E m | (cid:17) E m , (19a) E ′′ ( m +1) − def = d Edz (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z = z m +1 − = − k (cid:16) ν m + ǫ m | E m +1 | (cid:17) E m +1 . (19b)Subsequently, formulae (19) will be used to approximate each of the five terms onthe left-hand side of (12) with fourth order accuracy.To approximate the fluxes E ′ m ± in (12), we first use the Taylor expansion: E ′ m + = E m +1 − E m h − h E (3) m + + O (cid:16) h (cid:17) . Then, we approximate the third derivative E (3) m + with second order accuracy anduse (19), which yields: E (3) m + = E ′′ ( m +1) − − E ′′ m + h + O (cid:16) h (cid:17) = − k (cid:16) ν m + ǫ m | E m +1 | (cid:17) E m +1 + k (cid:16) ν m + ǫ m | E m | (cid:17) E m h + O (cid:16) h (cid:17) . Finally, we introduce the dimensionless grid size ˜ h = k h and obtain: E ′ m + = 1 h h (cid:16) ν m + ǫ m | E m +1 | (cid:17)! E m +1 − h h (cid:16) ν m + ǫ m | E m | (cid:17)! E m + O (cid:16) h (cid:17) . We repeat the calculation for E ′ m − . Altogether, the flux difference, i.e., the firstterm in (12), is approximated as dEdz (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z m + 12 z m − = E m +1 − E m h ν m ˜ h ! − E m − E m − h ν m − ˜ h ! + ǫ m ˜ h | E m +1 | E m +1 − | E m | E m h − ǫ m − ˜ h | E m | E m − | E m − | E m − h + O (cid:16) h (cid:17) . Lemma 1
Let E ∈ C ([ z m , z m +1 ]) . Let the values E m = E ( z m ) and E m +1 = E ( z m +1 ) be known along with the values of the one-sided second derivatives E ′′ m + and E ′′ ( m +1) − . Then, the function E ( z ) is approximated with fourth order accuracy: E ( z m + ζ h ) = P ( ζ ) + O (cid:16) h (cid:17) , z ∈ [ z m , z m +1 ] , (20a) by the Hermite-Birkhoff cubic polynomial: P ( ζ ) = E m − h E ′′ m + ! (1 − ζ ) + h E ′′ m + (1 − ζ ) + E m +1 − h E ′′ ( m +1) − ! ζ + h E ′′ ( m +1) − ζ . (20b) Moreover, given E m , E m +1 , E ′′ m + , and E ′′ ( m +1) − , the polynomial (20b) is unique. PROOF.
See Appendix C.Note that, in general, for the construction of P on a given individual interval [ z m , z m +1 ] , it is unimportant that the derivatives in formula (20b) are one-sided.We only use one-sided derivatives in order to be able to use the result in the con-text of discrete approximation on the entire grid, when the material coefficientsand hence second derivatives of the solution can undergo jumps at the grid nodes.We also note that the cubic polynomials built in accordance with Lemma 1 are notequivalent to the standard cubic splines, see Section 6.1 for more detail.Substituting expressions (19) into formula (20b), we obtain a fourth order approxi-mation of E ( z ) on [ z m , z m +1 ] : E ( z m + ζ h ) = h (cid:16) ν m + ǫ m | E m | (cid:17)! E m (1 − ζ ) − ˜ h (cid:16) ν m + ǫ m | E m | (cid:17) E m (1 − ζ ) + h (cid:16) ν m + ǫ m | E m +1 | (cid:17)! E m +1 ζ − ˜ h (cid:16) ν m + ǫ m | E m +1 | (cid:17) E m +1 ζ + O (cid:16) h (cid:17) . For convenience, let us rewrite the previous expression as E ( z m + ζ h ) = X i =0 F i ( ζ ; ˜ h, ν m ) v + i + O (cid:16) h (cid:17) , (21)15here F ( ζ ; ˜ h, ν ) = (1 − ζ ) ν ˜ h (cid:16) − (1 − ζ ) (cid:17)! , F ( ζ ; ˜ h, ν ) = ζ ν ˜ h (cid:16) − ζ (cid:17)! ,F ( ζ ; ˜ h, ν ) = ˜ h − ζ ) (cid:16) − (1 − ζ ) (cid:17) , F ( ζ ; ˜ h, ν ) = ˜ h ζ (cid:16) − ζ (cid:17) , and v +0 = E m , v +1 = ǫ m | E m | E m , v +2 = E m +1 , v +3 = ǫ m | E m +1 | E m +1 . Then, substituting expression (21) for E ( z ) into the last two integral terms of (12)and evaluating the integrals with respect to ζ , we have: Z z m + 12 z m E dz = h X i =0 Z F i ( ζ ; ν m , ˜ h ) dζ !| {z } f i v + i + O (cid:16) h (cid:17) = h X i =0 f i v + i + O ( h ) , Z z m + 12 z m | E | E dz = h X i,j,k =0 Z F i F j F k dζ !| {z } g ijk ( v + i ) ∗ v + j v + k + O (cid:16) h (cid:17) = h X i,j,k =0 g ijk · ( v + i ) ∗ v + j v + k + O (cid:16) h (cid:17) . The constants f i and g ijk in the previous formulae are defined as f i ( ν, ˜ h ) = Z F i ( ζ ; ν, ˜ h ) dζ , g ijk ( ν, ˜ h ) = Z F i F j F k dζ ,i, j, k = 0 , . . . , , (22)and their values are given in Table 1. As in the case of the second order scheme(Section 2.3), it is clear from the definition of the tensor elements g ijk , formula (22),that they are symmetric with respect to any permutation of the indices { i, j, k } .Evaluation of the integrals of (12) for the interval [ z m − , z m ] is nearly identical; itonly requires replacing ( ν m , ǫ m ) with ( ν m − , ǫ m − ) and v + i with v − i , where v − = E m , v − = ǫ m − | E m | E m , v − = E m − , v − = ǫ m − | E m − | E m − . Finally, by combining the approximations for all the individual terms in (12) we A direct computation of all the tensor elements in (22) could be quite tedious and proneto errors. This task, however, can be efficiently automated, see Appendix D. oefficients explicit expression f ( ν ) (cid:18) ν (cid:16) ˜ h (cid:17) (cid:19) f ( ν ) (cid:16) ˜ h (cid:17) f ( ν ) (cid:18) ν (cid:16) ˜ h (cid:17) (cid:19) f ( ν ) (cid:16) ˜ h (cid:17) g ( ν ) + ν ( ˜ h ) + ν ( ˜ h ) + ν ( ˜ h ) g ( ν ) ( ˜ h ) + ( ˜ h ) ν + ( ˜ h ) ν g ( ν ) ( ˜ h ) + ( ˜ h ) νg ( ν ) ( ˜ h ) g ( ν ) + ν ( ˜ h ) + ν ( ˜ h ) + ν ( ˜ h ) g ( ν ) ( ˜ h ) + ( ˜ h ) ν + ( ˜ h ) ν g ( ν ) ( ˜ h ) + ( ˜ h ) ν + ( ˜ h ) ν g ( ν ) ( ˜ h ) + ( ˜ h ) νg ( ν ) ( ˜ h ) + ( ˜ h ) νg ( ν ) ( ˜ h ) g ( ν ) + ν ( ˜ h ) + ν ( ˜ h ) + ν ( ˜ h ) g ( ν ) ( ˜ h ) + ( ˜ h ) ν + ( ˜ h ) ν g ( ν ) ( ˜ h ) + ( ˜ h ) ν + ( ˜ h ) ν g ( ν ) ( ˜ h ) + ( ˜ h ) νg ( ν ) ( ˜ h ) + ( ˜ h ) νg ( ν ) ( ˜ h ) g ( ν ) + ν ( ˜ h ) + ν ( ˜ h ) + ν ( ˜ h ) g ( ν ) ( ˜ h ) + ( ˜ h ) ν + ( ˜ h ) ν g ( ν ) ( ˜ h ) + ( ˜ h ) νg ( ν ) ( ˜ h ) Table 1Coefficients (22) of the fourth-order compact approximation (23a). Only 20 coefficientsare given out of o total of 64, because g ijk are symmetric with respect to the permutationsof indices, i.e. g = g , g = g , etc. obtain the following fourth order scheme: hF m ( E ) def = E m +1 − E m h ν m h k ! − E m − E m − h ν m − h k ! ǫ m h k | E m +1 | E m +1 − | E m | E m h − ǫ m − h k | E m | E m − | E m − | E m − h (23a) + hk ν m − X i =0 f i ( ν m − ) v − i + hk ǫ m − X i,j,k =0 g ijk ( ν m − )( v − i ) ∗ v − j v − k + hk ν m X i =0 f i ( ν m ) v + i + hk ǫ m X i,j,k =0 g ijk ( ν m )( v + i ) ∗ v + j v + k = 0 , where m = 1 , . . . , M . As in the case of second order approximations, the value ofthe field on the ghost nodes m = 0 and m = M + 1 will be determined from theboundary conditions, see Section 2.5.Similarly to the second order cases (Sections 2.2 and 2.3), if ν and ǫ are continuousat a given node z m , then E is smooth at this location and the scheme (23a) reducesto the following fourth order scheme for the differential equation (9a): F m ( E ) = E m +1 − E m + E m − h h k ν m ! + k ǫ m (cid:16) | E m +1 | E m +1 − | E m | E m + | E m − | E m − (cid:17) + k ν m X i =0 f i ( ν m ) (cid:16) v − i + v + i (cid:17) + k ǫ m X i,j,k =0 g ijk ( ν m ) (cid:16) ( v − i ) ∗ v − j v − k + ( v + i ) ∗ v + j v + k (cid:17) = 0 . (23b)Note that in the simplest case of a linear equation with constant coefficients, ǫ m ≡ and ν m ≡ ν = const , scheme (23b) transforms into E m − − E m + E m +1 h + k ν E m − + 4 E m + E m +1 h k ν E m − + 18 E m + 7 E m +1
384 = 0 . (24)It can be verified that the scheme (24) is equivalent (up to terms of order O ( h ) andhigher) to the standard three-point fourth order compact approximation E m − − E m + E m +1 h + k ν E m − + 10 E m + E m +1
12 = 0 . (25)of the linear constant coefficient Helmholtz equation [16].18 .5 Two-way boundary conditions We now derive the discrete version of the two-way boundary conditions (9b) at theinterface z = 0 and z = Z max . Recall that the two-way boundary condition (7b) wasconstructed in Section 1 so as to facilitate the propagation of the outgoing wavesthrough the interface z = 0 and at the same time to prescribe the given incomingsignal. This means that the solution to equation (9a) for z ≤ is to be composed ofa given incoming wave and the outgoing wave, which is not known ahead of time.Since for z ≤ the material is a homogeneous linear dielectric with ν ≡ and ǫ ≡ , we have: E ( z ) = E inc e ik z + Re − ik z , z ≤ , (26)and the boundary condition is derived from the continuity of E and E ′ at z = 0 .Our approach to constructing the discrete boundary condition for the scheme isto approximate (26) using closed form solutions of the corresponding differenceequation. This will provide the value of the solution at the ghost node E in termsof that at the boundary node E and the incoming beam E inc . Then, E can be elim-inated from the equation F [ E ] = 0 . A survey of methods for setting the boundaryconditions at external artificial boundaries can be found in [23]. In the context ofthe one-dimensional NLH, the continuous two-way boundary conditions are dis-cussed in [4]. For the multidimensional NLH, the continuous and discrete two-wayboundary conditions are constructed and implemented in [11–13].Since ν m ≡ and ǫ m ≡ for m = 0 , − , . . . (i.e., for z ≤ ), both the secondorder approximation and the fourth order approximation of Section 2 reduce to asymmetric constant-coefficient three-point discretization of the form: F m ( E ) = L E m − − L E m + L E m +1 , m = 0 , − , . . . , (27)where the coefficients L and L are different for each specific approximation. Forthe second order discretizations (17a) and (18a) we have: L = ˜ h − − , L = ˜ h − + 18 , while for the fourth order discretization (23a) we have: L = ˜ h − − − h , L = ˜ h − + 16 + 7384 ˜ h . The general solution of the difference equation (27) is C q m + C q − m , where q = L /L + i q − ( L /L ) and q − = L /L − i q − ( L /L ) (28)are roots of the characteristic equation L q − L q + L = 0 . As can be easilyseen from (28), | q | = 1 and q − = q ∗ . Moreover, one can show that the solution19 m approximates the right-traveling wave e ik z ≡ e ik h ( m − , and the solution q − m approximates the left-traveling wave e − ik z ≡ e − ik h ( m − , with respective ordersof accuracy (second or fourth), see [11–13] for more detail.Consequently, the discrete counterpart of formula (26) for m ≤ can be written as E m = E inc q m − + Rq − m , m = 1 , , − , . . . . (29)From equation (29) considered for m = 1 and for m = 0 we can express the valueof the solution at the ghost node E as E = ( q − − q ) E inc + qE . (30a)The discrete version of the two-way boundary condition (9b) at z = 0 is thenobtained by substituting E from (30a) into the discrete equation F [ E ] = 0 , i.e.,into equation (17a), (18a) or (23a) with m = 1 .Similarly, the discrete version of the Sommerfeld boundary condition (9b) at z = Z max is E M +1 = qE M . (30b)This relation is substituted into the discrete equation F M [ E ] = 0 . The discrete approximations (17a), (18a) and (23a) are coupled systems of nonlin-ear algebraic equations. In our previous work [11–13], we solved similar systemsby an iteration scheme based on freezing the nonlinearity | E | in equation (1). Indoing so, we have observed that the convergence of iterations was limited to rela-tively low-power incoming beams, i.e., weak nonlinearities.In this section, we describe a different iteration scheme for solving the NLH (9a)based on Newton’s method. Newton’s method cannot be applied to equation (9a)directly, because | E | is not differentiable in the Cauchy-Riemann sense and hencethe entire operator is not differentiable in the sense of Frechét. This difficulty andthe way to overcome it are first discussed in Section 3.1 through the considerationof Newton’s method for a single-variable complex function. The method is thenextended to multivariable functions in Section 3.2, and its application to the dis-cretizations of Section 2 is considered in Section 3.3. Note that the particular imple-mentation of Newton’s method presented hereafter leads to a convenient block tridi-agonal structure of the Jacobians that enables efficient numerical inversion ( O ( M ) time). 20 .1 A single complex variable The basic idea is, in fact, quite simple — while the function | E | is not differen-tiable with respect to E , it is differentiable with respect to Re ( E ) and Im ( E ) asa function of two real variables. Hence, Newton’s linearization can be obtained ifone complex equation is recast as a system of two real equations. Let us first recallNewton’s method for solving the scalar equation F ( E ) , where F is differentiable with respect to E . We denote the exact solution by ˜ E ,the j-th iterate by E ( j ) , and their difference by δE = ˜ E − E ( j ) . Using the Taylorexpansion around E ( j ) we have: F ( ˜ E ) = F ( E ( j ) + δE ) = F ( E ( j ) ) + dFdE (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) E = E ( j ) δE + O (cid:16) | δE | (cid:17) . Introducing the differential of F at E ( j ) : δF = J ( E ( j ) ) δE, where J ( E ) = dFdE , (31)we can then write: δF = F ( E ( j ) + δE ) − F ( E ( j ) ) + O (cid:16) | δE | (cid:17) = − F ( E ( j ) ) + O (cid:16) | δE | (cid:17) , and consequently, J ( E ( j ) ) δE = δF = − F ( E ( j ) ) + O (cid:16) | δE | (cid:17) . (32)Neglecting the O ( | δE | ) term in (32) and solving the equation with respect to δE we obtain the next iterate E ( j +1) = E ( j ) + δE : E ( j +1) = E ( j ) − h J ( E ( j ) ) i − F ( E ( j ) ) . (33)If the initial guess E (0) is chosen sufficiently close to ˜ E , then the sequence ofNewton’s iterations (33) is known to converge to the exact solution ˜ E as j → ∞ .Next, we consider the scalar equation F ( E ) = | E | E − E ∗ E − . (34)The modulus | E | in (34) is not differentiable with respect to E in the Cauchy-Riemann sense. However, F is differentiable as a function of two variables Re ( E ) and Im ( E ) or, alternatively, E and E ∗ . Hence, F ( ˜ E ) = F ( E ( j ) + δE ) = F ( E ( j ) ) + ( E ( j ) ) δE ∗ + 2 | E ( j ) | δE + O (cid:16) | δE | (cid:17) . ∂F∂E = 2 | E | and ∂F∂E ∗ = E , the analogue of (31) is δF = J δE + J δE ∗ , where J ( E ) = ∂F∂E , J ( E ) = ∂F∂E ∗ . (35)Consequently, the equivalent of (32) is J ( E ( j ) ) δE + J ( E ( j ) ) δE ∗ = δF = − F ( E ( j ) ) + O (cid:16) | δE | (cid:17) . (36)To solve equation (36) for δE and obtain the equivalent of (33), we separate thereal and imaginary parts of the function F and the independent variable E . This isconvenient to do by representing them as real × column vectors: E b E = Re ( E ) Im ( E ) , F b F = Re ( F ) Im ( F ) . Then, multiplication by a complex number and conjugation correspond to matrixoperations on R , which leads to a real Jacobian in (35). Indeed, multiplication bya complex number c can be represented as c · z d c · z = Re ( c · z ) Im ( c · z ) = Re ( c ) − Im ( c ) Im ( c ) Re ( c ) Re ( z ) Im ( z ) . If we associate a × real matrix bb c with a given complex number c : bb c = Re ( c ) − Im ( c ) Im ( c ) Re ( c ) = Re ( c ) + Im ( c ) −
11 0 , then c · z bb c · b z. Similarly, complex conjugation is a left multiplication by the matrix diag[1 , − : z ∗ c z ∗ = − Re ( z ) Im ( z ) = − b z. δF = | E ( j ) |
00 2 | E ( j ) | | {z }bb J Re ( δE ) Im ( δE ) + Re ( E ( j ) ) − Im ( E ( j ) ) Im ( E ( j ) ) Re ( E ( j ) ) | {z }bb J − Re ( δE ) Im ( δE ) = cc J + cc J − δ b E def = cc J δ b E, (37)where cc J = cc J + cc J − . Having derived the real Jacobian cc J , we neglect the quadratic terms in (36) to obtainthe following Newton’s iteration: b E ( j +1) − b E ( j ) = − (cid:20)cc J ( E ( j ) ) (cid:21) − b F ( E ( j ) ) . We now apply the procedure outlined in Section 3.1 to a system of the form F ( E ) =0 , where E = [ E , . . . , E M ] T ∈ C M and F = [ F , . . . , F M ] T ∈ C M . We would liketo solve the equations using Newton’s iterations of the type: E ( j +1) − E ( j ) = − h J ( E ( j ) ) i − F ( E ( j ) ) , where J ( E ) is the appropriate Jacobian of F ( E ) . As, however, the individual com-ponents of the vector F are not differentiable in the Cauchy-Riemann sense withrespect to the components of E , the Frechét differential of F ( E ) and the corre-sponding Jacobian can only be introduced as in Section 3.1, by recasting the equa-tion using the real and imaginary parts of all variables.As in Section 3.1, the variation of F ( E ) in terms of the field E and its conjugate E ∗ is given by δ F ( E ) = J δ E + J δ E ∗ , where J = ∂ F ∂ E and J = ∂ F ∂ E ∗ . Let us represent E and F as M × column vectors with real components:23 E = (cid:20) Re ( E ) , Im ( E ) , . . . Re ( E m ) , Im ( E m ) , . . . Re ( E M ) , Im ( E M ) (cid:21) T , b F = (cid:20) Re ( F ) , Im ( F ) , . . . Re ( F m ) , Im ( F m ) , . . . Re ( F M ) , Im ( F M ) (cid:21) T . To obtain the real Jacobian J , we will represent the complex matrices J and J asreal matrices of dimension M × M . Let A be a complex M × M matrix. Foreach entry A lm , we substitute the × real block dd A lm : A cc A = dd A . . . dd A M ... . . . ... dd A M . . . [[ A MM = Re ( A ) − Im ( A ) . . . Re ( A M ) − Im ( A M ) Im ( A ) Re ( A ) . . . Im ( A M ) Re ( A M ) ... ... . . . ... ...Re ( A M ) − Im ( A M ) . . . Re ( A MM ) − Im ( A MM ) Im ( A M ) Re ( A M ) . . . Im ( A MM ) Re ( A MM ) . Introducing the matrix direct product A ⊗ B as the matrix obtained by replacingeach entry A lm of A by the block A lm · B , we can write: cc A = Re ( A ) ⊗ + Im ( A ) ⊗ −
11 0 . Similarly, the conjugation of a column vector E can be represented as E ∗ d E ∗ = I M ⊗ − b E , I M ⊗ − = − . . . − M × M , where I M is the M × M identity matrix.Then, the differential of the real function b F is: δ b F ( b E ) = cc J δ b E + cc J δ d E ∗ = cc J δ b E , where the real Jacobian is given by the M × M matrix cc J = cc J + cc J · I M ⊗ . .3 Differentiation of F ( E ) with respect to E and E ∗ In this section, we discuss the actual differentiation of F ( E ) , i.e., the evaluation of J = ∂ F ∂ E and J = ∂ F ∂ E ∗ . As we shall see, the tensor notation of Sections 2.3 and 2.4prove extremely useful in this context.Using the identities ∂E ∗ i ∂E k = ∂E i ∂E ∗ k = 0 , ∂E i ∂E k = ∂E ∗ i ∂E ∗ k = δ ik = , i = k , i = k , we first differentiate the linear terms in F ( E ) and obtain for q = − , , : ∂∂E ∗ m + q ( L E m − − L E m + L E m +1 ) = 0 ,∂∂E m + q ( L E m − − L E m + L E m +1 ) = L δ − ,q − L δ ,q + L δ ,q , where the notation L , L for the coefficients of the scheme was introduced inSection 2.5. The nonlinear terms of the second order scheme (17a) are differentiatedas ∂∂E m | E m | E m = 2 | E m | , ∂∂E ∗ m | E m | E m = E m , and similarly for | E m ± | E m ± .For the alternative second order scheme (18a) we have: ∂∂E ∗ m + q X i,j,k =0 g ijk E ∗ m + i E m + j E m + k = X i,j,k =0 g ijk δ iq E m + j E m + k = X j,k =0 g qjk E m + j E m + k , and, using the symmetry of g ijk : ∂∂E m + q X i,j,k =0 g ijk E ∗ m + i E m + j E m + k = X i,k =0 g iqk E ∗ m + i E m + k + X i,j =0 g ijq E ∗ m + i E m + j = 2 X i,j =0 g ijq E ∗ m + i E m + j . Similarly, the nonlinear terms of the fourth order scheme (23a) are differentiated as ∂∂E m + q X i =0 f i v + i = X i =0 f i ∂v + i ∂E m + q , ∂∂E ∗ m + q X i =0 f i v + i = X i =0 f i ∂v + i ∂E ∗ m + q , ∂∂E m + q X i,j,k =0 g ijk · ( v + i ) ∗ v + j v + k = X i,j,k =0 g ijk v + i v + j ∂ ( v + k ) ∗ ∂E m + q + 2( v + i ) ∗ v + j ∂v + k ∂E m + q ! ,∂∂E ∗ m + q X i,j,k =0 g ijk · ( v + i ) ∗ v + j v + k = X i,j,k =0 g ijk v + i v + j ∂ ( v + k ) ∗ ∂E ∗ m + q + 2( v + i ) ∗ v + j ∂v + k ∂E ∗ m + q ! . Convergence of Newton’s iterations is known to be sensitive to how close the ini-tial guess E (0) happens to be to the solution E . Hereafter, we use two differentstrategies for choosing the initial guesses.When we test the convergence of Newton’s iterations in the vicinity of the exactsolution (of the discrete system of equations), we use for the initial guess the closedform solution of the continuous problem (9) of Chen and Mills [4, 5]. Indeed, wecan expect this solution to be either O ( h ) or O ( h ) close to the exact solution ofthe discrete system of equations F ( E ) = 0 . This expectation, which is based onthe accuracy analysis of Sections 2.2, 2.3, and 2.4 and does not involve a stabilityproof, is later corroborated experimentally (see Section 5).A more interesting case, of course, is when the solution is not known ahead oftime. In order to choose the initial guess in this case, we adopt a continuationapproach in the nonlinearity parameter ǫ . Namely, we increase ǫ in a series of in-crements: ǫ < ǫ < . . . < ǫ n , where at the j-th stage we apply Newton’s method to the NLH (9a) with ǫ = ǫ j using the solution from the j-1 stage with ǫ j − as the initial guess. In doing so, thevalue of ǫ n is the target nonlinearity parameter for a given computation, and it canbe large. At the beginning stage j = 0 , the initial guess may be chosen either as thesolution of the linear problem with ǫ = 0 , or as the solution obtained by iterationschemes based on freezing the nonlinearity as in [11–15], which converge for weaknonlinearities. An integral formulation of the NLH (9a) is discretized on the grid (11a) and writtenin the form F ( E ) = 0 . The operator F ( E ) is given by (17a), (18a) or (23a) at theinterior nodes m = 1 , . . . , M , while the ghost nodes E and E M +1 are specified This would be the case, in particular, for the multidimensional NLH.
26y (30a) and (30b), respectively. The resulting system of nonlinear equations islinearized: δ F ( E ) = J δ E + J δ E ∗ , J = ∂ F ∂ E , J = ∂ F ∂ E ∗ , where J and J are calculated in Section 3.3. An equivalent linearized form isobtained using the R M representation of Section 3.2: δ b F = cc J δ b E , cc J = cc J + cc J · I M ⊗ − . Subsequently, the real Jacobian cc J is used to build the sequence of Newton’s itera-tions: b E ( j +1) − b E ( j ) = − (cid:20)cc J ( E ( j ) ) (cid:21) − b F ( E ( j ) ) . As we shall see, it is sometimes useful to use a relaxation scheme: b E ( j +1) − b E ( j ) = − ω (cid:20)cc J ( E ( j ) ) (cid:21) − b F ( E ( j ) ) . (38)where the relaxation parameter is typically chosen in the range . ≤ ω ≤ . . Thevalue ω = 1 reduces (38) back to the original Newton’s method.The initial guess E (0) is taken as the closed form continuous solution when theconvergence of Newton’s iterations is first studied. In general, continuation by thenonlinearity parameter ǫ is used to compute the solutions for strong nonlinearities.The last important component of the overall numerical method is the inversion ofthe Jacobian. As the problem we are currently solving is one-dimensional, we areusing a direct sparse solver to evaluate (cid:20)cc J ( b E ( j ) ) (cid:21) − for every j = 0 , , , . . . . Forall our schemes, both second order and fourth order accurate, the Jacobian has threenon-zero diagonals composed of × blocks, which enables an efficient solutionby a sparse method. In this section, we experimentally assess the computational error and the conver-gence of iterations for the new method, by comparing it with other methods that wehave used previously to solve the NLH [11–13]. Since in the one-dimensional casethe closed form solutions are available explicitly, the numerical error is evaluateddirectly. All computations in this section are conducted with double precision.27 .1 Reference methods
In Section 5.2.1, we compare convergence of Newton’s iterations with that of theiterations based on freezing the nonlinearity: d E ( j +1) dz + ν ( z ) E ( j +1) + ǫ ( z ) | E ( j ) | E ( j +1) = 0 . (39)For every j = 0 , , , . . . , the next iterate E ( j +1) is obtained by solving the linear,variable coefficients, differential equation (39). This ”freezing” approach was usedin our earlier work [11–13] and also by Suryanto et al. [14, 15]. In addition, we usea relaxation method based on (39): E ( j +1) = (1 − ω ) E ( j ) + ωE ( j +1 / , (40)where E ( j +1 / in (40) is the solution of (39). Note that (40) is an analogue of (38).In Section 5.3.1 we evaluate the error of the compact schemes built in Section 2.We compare it with the error of the standard central-difference discretizations oforder two: E m − − E m + E m +1 h + k ( ν m + ǫ m | E m | ) E m = 0 (41)and of order four: − E m − + 16 E m − − E m + 16 E m +1 − E m +2 h + k ( ν m + ǫ m | E m | ) E m = 0 . (42)In all the simulations, we made sure that the iterations’ convergence was sufficientto enable a robust evaluation of the discretization error, i.e., to distinguish betweenthe error of the difference scheme and the error due to the possible “underconver-gence” of our iterations. Furthermore, we verified that the new iterative method(Newton’s) and the freezing iterative method (39), when they both converge, pro-vide the same error (Section 5.3.1). In this section, we discuss convergence of Newton’s method and compare it withthat of the nonlinear iterations (39) and (40). The parameters used are ν ≡ , k = 8 , Z max = 10 , and a uniform nonlinearity profile ǫ ( z ) ≡ const . Note that ν ≡ corresponds to the case of the linear index of refraction being the sameinside and outside the Kerr medium. For these parameters, the first nonuniquenessregion occurs around ǫ = ǫ c ≈ . (see Figure 2).28 .2.1 Local convergence ε |T| A CD EBc ε ε ’c Figure 3. Local convergence experi-ments for Newton’s iterations performedin the region of the first switchbackof the one-dimensional NLH (9a) (seeFigure 2). Each of the initial guessesnear A–E converged to the correspond-ing discrete solution. In addition, thecontinuation approach with A as the ini-tial guess and ǫ = 0 . converged to B,while continuation with E as the initialguess and ǫ = 0 . converged to D. The goal of the first series of compu-tations is to determine how the magni-tude of nonlinearity (i.e., the value of ǫ )affects the convergence of Newton’s it-erations relative to that of nonlinear it-erations (39). To achieve this goal, wechoose the initial guess to be the point-wise values of the continuous exact solu-tion on the grid, which, as noted, is closeto the actual discrete solution for finegrids. To distinguish between the issuesrelated to iterations’ convergence andthose pertinent to a specific discretiza-tion, we choose one particular scheme,the simplest second order scheme (41),for all the convergence experiments inthis subsection.The nonlinear iteration scheme (39) con-verges for ǫ < . ≈ . ǫ c . Its relax-ation analogue (40) with ω = 0 . allowsus to increase the convergence range upto about . ≈ . ǫ c . Decreasing the value of ω does not seem to have a significanteffect. For ǫ above these thresholds the iterations diverge, and the divergence occursalso for other discretizations that we have used (Section 2) rather than only for (41).Therefore, it shall be interpreted as a limitation of the iteration procedure itself. Thisdivergence is not related to the onset of nonuniqueness in the NLH, because theconvergence breaks down far below the nonlinearity threshold for uniqueness ǫ c .In contradistinction to that, Newton’s method convergence for ǫ ∈ [0 , . , ex-cept near the switchback points ǫ = ǫ c and ǫ = ǫ ′ c , where (cid:16) dTdǫ (cid:17) − = 0 (see Fig-ure 3). As expected, the convergence of Newton’s iterations considerably slowsdown as | ǫ − ǫ c | or | ǫ − ǫ ′ c | becomes small, and eventually, close enough to a givenswitchback point, the iterations diverge (the Jacobians degenerate at the switchbackpoints). Other than that, the method shows robust convergence. Specifically, themethod converges when the solutions are non-unique: When the initial guess wasclose to one of the points B, C, or D in Figure 3, which correspond to ǫ = 0 . inside the first region of nonuniqueness, the method converged to the respectivediscrete solution. This indicates that if the grid is sufficiently fine to tell betweentwo close solutions inside the switchback (see Section 5.3.1), Newton’s method hasan adequate domain of convergence. Similar results were obtained for ǫ = 0 . ,which is in the middle of the second switchback region . . ǫ . . .29ven at the highest nonlinearity we have tried, ǫ = 3 ≈ ǫ c , Newton’s iterations stillconverge. The value ǫ = 3 is well beyond the first switchback. Moreover, it is anextremely high nonlinearity in two respects: First, for ǫ = 3 there are distinct, i.e.,nonunique solutions (we tested the convergence to the highest power solution). Sec-ond, at this value the nonlinear response is so large that it would cause a breakdownand ionization of the actual physical material in an experimental setting, which ren-ders the original Kerr model inapplicable. We note that we did not observe in oursimulations any convergence deterioration of Newton’s method around ǫ = 3 com-pared with smaller ǫ ’s, it only requires 3 to 6 iterations to drive the residual down by10 orders of magnitude. We thus assume that most likely Newton’s method wouldhave converged for much higher values of ǫ as well, although it has not been triedbecause of the physical irrelevance.As normally expected from Newton’s method, the iterations converge rapidly, ata quadratic rate. In most of the cases that we studied in this section, it took 4–6iterations to reduce the original residual by 9 to 11 orders of magnitude. As hasbeen mentioned, the only situation when Newton’s convergence may noticeablyslow down is for ǫ near the switchback points ǫ c and ǫ ′ c , where the tangent to thecurve T ( ǫ ) becomes vertical, see Figure 3. Convergence of the iteration scheme (39)which is based on freezing the nonlinearity, is much slower. It is estimated as linearbased on experimental evidence, and on the fact that it can be interpreted as a fixed-point iteration. Having seen that Newton’s method is locally convergent, we would like to testits performance for initial guesses that are not necessarily close to the solution.Our first observation is that when the solution to the linear problem is used as theinitial guess, i.e., E (0) = e ik z , then the iterations converge for ≤ ǫ ≤ . , anddiverge for ǫ > . . Therefore, for larger values of ǫ , we employ a continuationheuristics, increasing the nonlinearity in a series of increments (see Section 3.4). We emphasize that this version of the method uses no a priori knowledge of thesolutions sought for.
For the experiments in this section, we use the compact fourthorder scheme (23a).In order to quantify the performance of the continuation heuristic, we tested,for initial values of ǫ i in [0 , . , the ranges of allowable positive increments dǫ = ǫ f − ǫ i for which Newton’s method would still converge. In other words,for each pair ( ǫ i , dǫ > we applied Newton’s method for the NLH with non-linearity ǫ f = ǫ i + dǫ, and with the initial guess E (0) given by the solution with ǫ = ǫ i . The results are displayed in Figures 4 and 5. Several observations can be We are primarily interested in the positive increments because our key objective for em-ploying the continuation strategy is to obtain suitable initial guesses for Newton’s methodwhen it is applied to high energy cases. dǫ demonstrate a rather irregular behavior. It is most im-portant however, that the allowable values of dǫ do not decrease to zero , so thatthe continuation strategy for Newton’s method can traverse through the first andsecond switchback regions. ε d ε A ε d ε B Figure 4. A) The allowable positive increments dǫ = ǫ f − ǫ i for which the continuationmethod works, i.e., for which Newton’s method converges; ν ≡ , k = 8 and Z max = 10 .The iterations were defined as converged if the residual decreased by a factor of in20 iterations. B) Same as (A), for Newton’s method with relaxation (38), with ω = 0 . .The iterations were defined as converged if the residual decreased by a factor of in 60iterations. |T| ε . d ε A ε d ε |T| B Figure 5. A) Same as (4A), plotted together with the transmittance T ( ǫ ) (see Figure 2). B)Same as (A), zooming on the first switchback region The previous tests were rerun with the relaxation version (38) of Newton’s method.The results are presented in Figure 4(B). One can see that the relaxation consider-ably increases the allowable increments, especially at the switchback regions.We next study the performance of Newton’s method inside the nonuniqueness re-gion. When the initial guess was the solution slightly before the first switchback31n the T ( ǫ ) curve, at point A in Figure 3, and the value of ǫ was chosen withinthe nonuniqueness region: ǫ = 0 . , the method converged to the lower branch ofthe switchback, i.e., to the solution denoted by B in Figure 3. Similarly, selectingthe initial guess slightly past the switchback, at point E in Figure 3, and again tak-ing ǫ = 0 . (negative increments are not shown in Figures 4 and 5), facilitatedconvergence to the higher branch of the switchback curve, i.e., to the solution D inFigure 3. This behavior agrees with the standard notion of a hysteresis loop for abistable device.It is also important to note that the continuation strategy for Newton’s method can“hop over” (at least) the first and second switchback regions, which is an efficientway of reaching the regions of high nonlinearity. For example, a transition acrossthe first switchback, from point A to point E in Figure 3, is possible by choosingthe solution at point A as the initial guess for computing the solution with the valueof ǫ that corresponds to E. In this context we should mention that the first twononuniqueness regions are rather narrow. For wider nonuniqueness regions thatcorrespond to larger values of ǫ , using a combination of the continuation in ǫ andrelaxation can be beneficial. Indeed, even though more iterations will be requiredfor convergence with relaxation, larger allowable increments dǫ = ǫ f − ǫ i will helptraverse those wider regions of nonuniqueness, see Figure 4(B). In this section, we consider the case of a homogeneous Kerr medium (see formula(4)). Hence, the discontinuities are only at z = 0 and z = Z max . The error of thesolutions computed with the new schemes (17a), (18a), and (23a), as well as thereference schemes (41) and (42), is reported in Table 2. All computations are doneusing Newton’s solver. The error is defined as the difference between the computedsolution and the closed form Chen and Mills solution [4], and is evaluated in themaximum ( l ∞ ) norm.The discrete approximations of Section 2 are designed to retain their order of ac-curacy in the presence of material discontinuities. In order to test this, for eachscheme we consider two cases, see Table 2. The first case, ν = 1 . , ǫ = 0 . ,corresponds to a small discontinuity at the boundary and a weak nonlinearity. Notethat the quantities √ ν − . and ǫ characterize the difference between thelinear and nonlinear indices of refraction inside and outside the medium, see Sec-tion 1.1. The second case, ν = 1 . , ǫ = 0 . , corresponds to a large discontinuityat the boundary and an O (1) nonlinearity ( ǫ/ν = 0 . ).In the first case, ν = 1 . , ǫ = 0 . , the computations can also be repeated us-ing the original iterative solver (39), because for this choice of parameters it still32 ǫ ˜ h ≡ hk · − · − . · − · − . · − error (cid:16) ˜ h (cid:17) Standard centered-difference O (cid:0) h (cid:1) discretization (41) . . - .
230 0 . . · − . · − . · ˜ h . . - - .
16 8 . · − . · − . · ˜ h Standard centered-difference O (cid:0) h (cid:1) discretization (42) . .
01 0 .
187 2 . · − . · − . · − . · − . · ˜ h + 0 . · ˜ h . . - .
15 0 .
093 5 . · − . · − · ˜ h + 0 . · ˜ h Finite-volume O (cid:0) h (cid:1) discretization (17a) . . - .
107 1 . · − . · − . · − . · ˜ h . . - . · − . · − . · − . · − . · ˜ h Alternative finite-volume O (cid:0) h (cid:1) discretization (18a) . . - .
109 1 . · − . · − . · − . · ˜ h . . - - . · − . · − . · − . · ˜ h Finite-volume O (cid:0) h (cid:1) discretization (23a) . .
01 0 .
121 1 . · − . · − . · − . · − . · ˜ h . . - . · − . · − . · − . · − . · ˜ h Table 2Error for the 3 schemes of Section 2 and 2 reference schemes of Section 5.1; Z max = 10 , k = 8 . The entries are empty for cases wherein Newton’s iteration diverged. converges. Having done that, we determined that the accuracy of the correspond-ing solution was the same as the accuracy of the solution obtained using Newton’smethod. This indicates that the errors presented in Table 2 are indeed the approxi-mation errors of the discrete schemes and should not be attributed to the solver. Forthe case with higher nonlinearity, ǫ = 0 . , only Newton’s iterations converged.The functional dependence of the error on the dimensionless grid resolution ˜ h = k h = k Z max M is shown in the rightmost column of Table 2. It is obtained bya weighted least squares fit. Considering the reference methods of Section 5.1, thethree-point central-difference approximation (41) displays a second order conver-gence. The five-point central-difference approximation (42), however, displays afourth order convergence for (relatively) low grid resolutions and small discontinu-ities. For high grid resolutions and large discontinuities, however, its accuracy dete-riorates and shows a second order convergence. The limited ability of the referencemethods to handle discontinuities is also reflected by the fact that the actual errors,and hence the coefficients in front of ˜ h and ˜ h increase substantially for largerdiscontinuities. As mentioned in Section 1.1, the five-node discretization (42) is33articularly sensitive to the presence of discontinuities.On the other hand, the errors of the new second order discretizations (17a)and (18a), as well as that of the fourth order scheme (23a), are hardly affectedby the increase of the discontinuity. Indeed, for larger discontinuities at the bound-ary, the improvement over (41) ranges from a factor of 4 for (18a) to a factor of 10for (17a). The improvement of (23a) over (42) is even more substantial.We also see that (17a) yields better accuracy (smaller errors) than (18a). Indeed,intuitively one can expect that the integration of the interpolation of | E | E willapproximate R | E | Edz better than the integration of the interpolation of E cubed.However, as we do not have d ( | E | E ) dz or d ( | E | E ) dz , the discretization (17a) cannot beextended to fourth order accuracy. There, perhaps, could be other approaches, suchas the interpolation of the amplitude and phase of E . They, however, do not providean obvious venue to the fourth order either.Regarding the new fourth order discretization (23a), we can see from Table 2 that itis indeed fourth order accurate for both small and large material discontinuities. Aminor increase of the error for larger ǫ can be observed, which is natural to expectfor solutions with sharper variations. Altogether, for the cases reported in Table 2scheme (23a) has proven up to 6000 times more accurate than the standard five-node central-difference scheme (42). -2 -1 normalized grid size h*k -9 -6 -3 |E-E (h) | ν =1.01 , 5-pt FD ν =1.01 , new FV ν =1.3 , 5-pt FD ν =1.3 , new FV Figure 6. Computational error as a function of the grid size for schemes (23a) [labeled “newFV”] and (42) [labeled “5-pt FD”].
To provide a more descriptive and more intuitive account of our grid convergenceresults, we present a log–log plot of the error as it depends on the grid size forour fourth order schemes, see Figure 6. The data used for Figure 6 are the same34s those in Table 2. Once can clearly see that the accuracy of the original fourthorder scheme of [11,12] deteriorates on fine meshes, whereas the new scheme (23a)maintains its fourth order.Finally, we compare the minimum grid resolutions required by different schemes(second and fourth order) to distinguish between the solutions inside the regionof nonuniqueness (points B, C, and D in Figure 3) and thus enable convergenceof Newton’s iterations. As could be expected, the fourth order scheme used forcomputations of Section 5.2.2 took roughly ten times fewer points per wavelengththan the second order scheme used for computations of Section 5.2.1.
Here, we apply Newton’s method along with the fourth order scheme (23a) to solvethe NLH for a piecewise-constant material. The configuration is that of a two-layerKerr slab: ν ( z ) = . z ∈ [0 , . , z ∈ (5 , , ǫ ( z ) = . z ∈ [0 , . , z ∈ (5 , . (43)The material coefficients are therefore discontinuous at z = 5 , as well as at theboundaries z = 0 and z = 10 . The value of the linear wavenumber is k = 8 .The computed solution is compared with the closed form solutions obtained byChen and Mills in [5]. The results are given in Table 3; they corroborate the de-signed fourth order accuracy of the method. ˜ h ≡ hk · − · − . · − · − . · − error (cid:16) ˜ h (cid:17) . · − . · − . · − . · − . · − . · ˜ h Table 3Error for the two-layered configuration (43). The finite-volume O (cid:0) h (cid:1) discretization (23a)was used in combination with Newton’s method for k = 8 . Having addressed the issues of convergence and accuracy, we would also like tocomment on the numerical efficiency of our method.35 rid dimension M . . CPU time (sec) .
105 0 .
333 1 .
06 3 .
39 11 . Table 4Mean CPU times for a single Newton’s iteration of the finite volume scheme (23a) on AMDAthlon64 at 2200MHz in Matlab 7.3.0 under Linux.
The CPU times for one Newton’s iteration summarized in Table 4 clearly indicatethat the complexity scales linearly as a function of the grid dimension. Moreover,as the number of Newton’s iterations required to obtain the solution typically doesnot depend on the grid dimension (see Section 5.2), we can say that the overallcomplexity of the proposed method also depends linearly on the grid. Of course,it is natural to expect that the methods based on shooting [4, 5, 7–10] will performfaster for a one-dimensional problem than our method that involves a full fledgedapproximation of the boundary value problem for the NLH. The shooting-basedmethods, however, will not generalize to multiple space dimensions.
In this study, we approximated the one-dimensional NLH using new compact finitevolume schemes of orders four and two (the latter predominantly for reference pur-poses). The fourth order approximation of nonlinear terms was the most challeng-ing part, and it required the use of Birkhoff-Hermite interpolation (see Lemma 1and Appendix C). For actual implementation, the automation of the computation ofthe coefficients of the scheme was crucial (see Appendix D). Note that as we haveinterpolated the field using third degree polynomials, the cubic nonlinearity insideevery cell was represented by the polynomials of a relatively high degree — degree9. We, however, have not seen any adverse implications of that in our simulations.Let us also mention that the piecewise interpolating polynomial of the Birkhoff-Hermite type is not, generally speaking, equivalent to the standard Schoenbergcubic spline (see, e.g., [24, Section 2.3.2]). Indeed, the Schoenberg spline takesonly the nodal values of the interpolated function as input, and is built so that itsfirst and second derivatives are continuous at the nodes. In doing so, the equationsfor the coefficients of the spline become coupled across the entire grid. In contradis-tinction to that, all individual polynomials of the Birkhoff-Hermite interpolation arebuilt independently of one another inside their respective cells. Moreover, the nodal This is a function on [0 , Z max ] that on every grid cell coincides with the correspondingcubic polynomial obtained by Lemma 1. O ( h ) , because inside every cell the first derivative is approximated withthird order accuracy.Note also that as an alternative to the integral formulation (Section 2.1) and the fi-nite volume scheme built uniformly across the entire domain, one could have usedcompact finite differences on the regions of smoothness coupled with the condi-tion of continuity of the first derivative at the interfaces. The latter can be built intothe scheme, say, via one-sided differences. This approach, however, is not equiva-lent to ours and may, in our opinion, come short at least along the following twolines. First, the uniformity of the approach will be lost — another discontinuityintroduced in the domain will require special treatment and hence the scheme willhave to be rebuilt. Second, compactness of the approximation will be compromisedbecause of the long one-sided stencils near the discontinuities and as such, the re-sulting matrix will have a higher bandwidth.The non-reflecting two-way artificial boundary conditions (Section 2.5) were de-signed similarly to our previous work [11–13]; they are based on the analysis ofthe waves governed by the discrete equation. The boundary conditions prescribethe impinging wave that drives the problem and at the same time enable the prop-agation of all the outgoing waves. The important difference compared to [11, 12]is, however, that for a compact scheme, even fourth order accurate, it is sufficientto consider only one ghost node outside the computational domain, whereas forthe five-point central-difference approximation (42) we had to introduce two ghostnodes. Indeed, for the linear homogeneous five-point discretization, an additionalevanescent mode always exists, which needs to be handled with care, see [11, 12].For the compact three point discretization, however, no such mode exists, and theconstruction of the boundary-condition is greatly simplified. In both cases, the ad-ditional assumption that we used when calculating the value of the solution at theghost nodes is that outside the domain of interest the field is governed by the linearconstant coefficient Helmholtz equation.The analysis in the paper establishes the formal accuracy of our schemes (i.e., itis the analysis of consistency). We do not, however, derive any rigorous error esti-mates because the problem is nonlinear. Instead, we study the computational errorexperimentally (see Section 5.3). By comparing our numerical solutions with theclosed form solutions of [4, 5], we have been able to demonstrate that in all thecases our schemes possess the design rate of grid convergence. Besides, we pro-vide a convergence proof for a linear layered medium (see Appendix B).Our nonlinear solver for the discretized NLH exploits Newton’s iterations. How-ever, the nonlinearity in equation (9a) is nondifferentiable in the sense of Frechét37or complex-valued solutions E . Therefore, we present a convenient mechanism fortransforming the nonlinear systems of equations to the representation in real vari-ables, which enables Newton’s linearization. The results fully justify this additionaleffort. Indeed, Newton’s iterations allow us to solve the NLH for very high nonlin-earities, addressing the full range of nonlinearities interesting from the standpointof physics, and beyond, to the level of the actual material breakdown. Moreover,even though Newton’s method has been applied to problems with Kerr nonlinear-ity previously [17], our current implementation is particularly well suited for theone-dimensional NLH as it yields block tridiagonal Jacobians.We now compare our current work with other studies available in the literatureon the numerical solution of boundary value problems for the NLH: our previouswork [11–13], and Suryanto et al. [14, 15]. In terms of discrete approximations,these previous studies did not guarantee a fourth order approximation for materi-als with discontinuities. We have shown this directly for the discretization of ourwork [11–13] (Section 5.3.1), while for Suryanto’s finite element discretizationwhich accounts for discontinuities, the nonlinearity is approximated only to thesecond order. We again emphasize that to the best of our knowledge, the currentmethod is the first ever high-order approximation of the NLH with material discon-tinuities . The additional improvement is due to the Newton’s solver. All iterativeschemes used previously were based on freezing the nonlinearity [11–15]. As wehave seen, this freezing approach cannot be used beyond a certain ǫ threshold, un-related to the uniqueness of the solutions. We note that the freezing approach wasused by Suryanto et al. to solve the NLH for the cases when the solution is notunique. They report, however, that their setup was that of a highly-grated materialwith a defect, and that often for such setups the threshold for non-uniqueness ismuch lower (in fact, lowering of the threshold was one of the goals in [14, 15]).This is in agreement with our own observations: The freezing approach fails at acertain nonlinearity threshold unrelated to the solution uniqueness. Therefore, theresults show that Newton’s method, compared with the commonly used freezingapproach, allows for the much high levels of nonlinearity. Apparently, this is thefirst numerical method for the NLH that works at such high nonlinearities. To sum-marize, compared to [11–15], the approach of the current paper enables efficientdiscrete approximation for a problem with material discontinuities and allows so-lution for high levels of nonlinearity. We expect that it will provide a basis for thefuture extension to the case of multiple space dimensions (see Section 6.2).Let us also mention a few additional studies that have something in common but arenot as close to the current work. In [25], Choi and McKenna analyzed a somewhatdifferent equation: ∆ u + u = 0 . They could employ the mountain pass ideasbecause the boundary condition was homogeneous Dirichlet and hence u ≡ wasa solution. This approach, however, will not apply to the NLH, which is normallyto be driven by a given incoming wave at the boundary.In yet another series of papers, Kriegsmann and Morawetz solve a linear Helmholtz38quation with variable coefficients [26] and a focusing NLH [27] in two spacedimensions, and then Bayliss, Kriegsmann and Morawetz consider a defocusingNLH [28]. They employ second order approximations, and the solver is based onintegration in real time (i.e., using the wave equation) and applying the principleof limiting amplitude. The problem they solved is very different though, so at themoment we cannot compare their method with ours. The method can be extended to the case of a quintic nonlinearity, σ = 2 . This willinvolve evaluation of the fifth order tensor coefficients [cf. formula (22)]: g ijklm = Z F i F j F k F l F m dζ . This is a straightforward, though tedious extension, for which the automatic gener-ation of tensor elements will be a necessity.The method can also be extended to the case of piecewise smooth material coef-ficients ν ( z ) and ǫ ( z ) , as opposed to only piecewise constant coefficients that wehave analyzed in the paper. Approximating the quantities ν ( z ) and ǫ ( z ) by cubicpolynomials within each grid cell: ν ( z m ± ζ h ) = X j =0 c ± j ζ j , ǫ ( z m ± ζ h ) = X k =0 d ± k ζ k , one can then substitute these approximations into formulae (19) for one-sided sec-ond derivatives, and into the definitions (22) for the coefficients f i and g ijk .Likewise, linear and nonlinear absorption can be modeled by allowing the materialcoefficients to become complex. Note, however, that in this case the tensor elements g ijk will also become complex, g ijk = Z F ∗ i F j F k dζ , and will lose their symmetry with respect to the indices i, j, k .Furthermore, for the linear Helmholtz equation, which corresponds to the case ǫ ≡ in this paper, the scheme we have used to approximate (12) to fourth or-der accuracy can be extended to arbitrarily high orders at virtually no computa-tional cost. For example, using the first three even (one-sided) derivatives at z m : { E m , E ′′ m + , E (4) m + } and at z ( m +1) − : { E m +1 , E ′′ ( m +1) − , E (4)( m +1) − } , one can constructthe Birkhoff-Hermite quintic polynomial: P (cid:16) ζ ; E m , E ′′ m + , E (4) m + , E m +1 , E ′′ ( m +1) − , E (4)( m +1) − (cid:17) , E ( z m + ζ h ) = P ( ζ ) + O (cid:16) h (cid:17) , and then use it to approximate the integrals in (12). The values of the one-sidedderivatives are again obtained from the equation: E (4) = − k νE ′′ = k ν E , etc.Note that this extension cannot be used for the NLH.From the standpoint of physics, a very useful extension could be that of consideringthe vectorial NLH, when no assumption of the linear polarization of the field ismade. Building boundary conditions for this case may require special care.On the numerical side, to improve the quality of approximation a nonuniform gridcan, in principle, be used that would be better suited for resolving sharp variationsof the solution. In general, however, the structure of the solution is not known aheadof time, and therefore, a methodology of this type can only be adaptive.Improvements can also be introduced aimed at reducing the CPU time for themethod proposed in this paper. For example, the summation P g ijk was performedin Sections 2.4 and 3.3 without using the symmetry of the tensor g ijk , see Table 1.Taking it into account could decrease the cost of constructing the Jacobians. Webelieve, however, that it can only benefit the one-dimensional problem because in2D the overall cost will most likely be dominated by the inversion of the Jacobian.The extension of utmost interest to us, from the standpoint of both theory and ap-plications, is to the multidimensional case, see equation (1). It is well known thatunder the paraxial approximation, the NLH reduces to the nonlinear Schrödingerequation, which possesses singular solutions. Therefore, the question of global ex-istence for the solutions of the NLH in similar configurations is of a substantialmathematical and physical interest (see [11–13] and the bibliography there for moredetail). Currently, the only analytical result in this regard is due to Sever [29], whoproved existence for the NLH with real Robin boundary conditions. However, theradiation boundary conditions which model the physical problem do not lead tolinearized self-adjoint formulations. Hence, the question of global existence in thiscase remains outstanding.We re-emphasize that none of the shooting-type methods [4, 5, 7–10] that are ap-parently faster than ours in 1D can be generalized to multiple space dimensions.Hence, the only viable option in multi-D is to approximate on the grid and solvenumerically the boundary value problem for the NLH. Construction of a compactfinite volume discretization in multi-D is possible, although it will not be an auto-matic generalization of what has been done in the 1D case. The use of Newton’smethod will be of foremost importance, because as we have seen, a simpler itera-tion scheme has severe convergence limitations. Hence, the key contribution to theoverall computational cost in multi-D will be from the inversion of the Jacobians— large, sparse, non-Hermitian matrices. The use of direct solvers does not seemfeasible for those dimensions that will provide a sufficiently fine grid resolution.40he only viable alternative is the preconditioned Krylov subspace iterations, andas such, finding a good preconditioner will be in the focus of the study. Besides,convergence of Newton’s iterations slows down if two solutions in the region ofnonuniqueness are close to one another, such as near the switchback points in Fig-ure 3. In the multidimensional case, however, we do not know the structure of theNLH solutions ahead of time. Thus, numerical experiments will play a key role forfine-tuning the method. A Continuity conditions at material interfaces
For a linearly polarized plane wave that impinges normally on the interface z =const , we may assume without loss of generality that the electromagnetic field hasthe form: E = [ E , , , H = [0 , H , . The tangential component of the electric field must be continuous across the inter-face, see, e.g., [30, 31]. In our case, this implies the continuity of E ≡ E ( z ) . Thesame is also true for the tangential component of the magnetic field H ≡ H ( z ) ,which we do not consider explicitly in the current framework. The continuity of H , however, allows us to establish another important condition for E . The time-harmonic form of the Faraday’s law (a part of the Maxwell system of equations)reads: − iωµc H = curl E , and taking into account that E ≡ we have: − iωµc H = ∂E ∂z − ∂E ∂x = ∂E ∂z . Then, disregarding all possible magnetization effects, i.e., assuming that the mag-netic permeability is equal to 1 (which is certainly legitimate for optical frequen-cies), we obtain that the first derivative of the electric field ∂E ∂z ≡ dE ( z ) dz is alsocontinuous across any interface z = const , and hence everywhere. B Error estimate in the linear case
For a linear medium ( ǫ ≡ ) and piecewise constant refraction index ν ( z ) , we willshow that the fourth order scheme (23a) indeed converges with the design rate of O ( h ) as h −→ . Let ν ( z ) be a step function: ν ( z ) = ν left = 1 , z < ν right = 1 , z > , z ∈ [ − Z max , Z max ] . E inc = e i √ ν left k z ≡ e ik z , and letit satisfy the boundary conditions [cf. formulae (9b)]: ik + ddz ! E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z = − Z max = 2 ik , ik − ddz ! E (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) z = Z max = 0 , where k = √ ν right k .Then, the continuous solution of the problem is E ( z ) = e ik z + Re − ik z , z < ,T e ik z , z > , (B.1)where the transmission and reflection coefficients are given by T = 21 + q ν right /ν left and R = 1 − q ν right /ν left q ν right /ν left . On the uniform grid z m = mh , m = 0 , ± , ± , . . . , we define: ν m = ν left , m < ,ν right , m ≥ . The fourth order discretization (23a) then reduces to L ( ν m − ) E m − − ( L ( ν m − ) + L ( ν m )) E m + L ( ν m ) E m +1 = 0 , (B.2)where L ( ν ) = ˜ h − − ν − ν ˜ h and L ( ν ) = ˜ h − + 16 ν + 7384 ν ˜ h , and ˜ h = hk . For m < and, independently, for m > , the fundamental set ofsolutions of the difference equation (B.2) is { q mν , q − mν } , where q ν and q − ν are rootsof the characteristic equation L ( ν ) q − L ( ν ) + L ( ν ) q − = 0 , and ν = ν left or ν right , respectively. The roots are given by the following expressions: q ν = L /L + i q − ( L /L ) , q − ν = q ∗ ν . The Taylor expansion yields: q ν = e i √ ν ˜ h + O (˜ h ) , which means that q mν approx-imates the right-traveling wave e i √ νk z m ≡ e i √ νk hm , while its conjugate q − mν ap-proximates the left-traveling wave e − i √ νk z m ≡ e − i √ νk hm , with fourth order accu-racy on any finite interval of the independent variable z : max m | q mν − e i √ νk z m | ≤ const · h , max m | q − mν − e − i √ νk z m | ≤ const · h . (B.3)42imilarly to (B.1), the discrete solution of the problem is constructed in the form: E m = q m left + R ( h ) q − m left , m ≤ ,T ( h ) q m right , m ≥ , (B.4)where q left def = q ν for ν = ν left , q right def = q ν for ν = ν right , and the reflection andtransmission coefficients are obtained from the condition of continuity at m = 0 : R ( h ) = T ( h ) , (B.5a)and from the difference equation (B.2) at m = 0 , which reads: L ( ν left )( q − + R ( h ) q left ) − ( L ( ν left ) + L ( ν right )) T ( h ) + L ( ν right ) T ( h ) q right = 0 . (B.5b)Solving the system of equations (B.5) with respect to R ( h ) and T ( h ) and using theTaylor expansion of the resulting solution, one can show that R ( h ) = R + O ( h ) and T ( h ) = T + O ( h ) . These relations, along with estimates (B.3), imply that thediscrete solution E m of (B.4) converges to the continuous solution E ( z ) of (B.1)with the rate O ( h ) as h −→ . C Birkhoff-Hermite interpolation (proof of Lemma 1)
A large body of work has been done by different authors on Birkhoff-Hermite in-terpolation. Nonetheless, for the completeness of our analysis we present an ele-mentary convergence proof in the case of cubic polynomials. It is self-containedand does not require any additional facts from the literature.It will be convenient to make the change of variables: x = z − z m + , so that x ∈ h − h , h i . With respect to the new coordinate x , the material discontinuitiesare allowed at x ± h , whereas on the interval ( − h , h ) and, in particular, at the cellcenter x = 0 , the solution is C ∞ .A cubic polynomial P ( x ) that satisfies P (cid:16) ± h (cid:17) = E ± h and P ′′ (cid:16) ± h (cid:17) = E ′′± h is P ( x ) = (cid:18) − xh (cid:19) E − h − h E ′′− h ! + h E ′′− h (cid:18) − xh (cid:19) + (cid:18)
12 + xh (cid:19) E h − h E ′′ h ! + h E ′′ h (cid:18)
12 + xh (cid:19) . (C.1)43t is unique since the four parameters E ± h , E ′′± h uniquely determine the four coef-ficients c j of P ( x ) = P j =0 c j x j via the solution of the corresponding × linearsystem.Next, we prove that the polynomial P ( x ) is indeed a fourth order approximationof the field E ( x ) . Differentiating P ( x ) of (C.1) three times at x = 0 , we have: P (0) = E h + E − h − h E ′′ h + E ′′− h , P ′′ (0) = E ′′ h + E ′′− h ,P ′ (0) = E h − E − h h − h E ′′ h − E ′′− h h , P (3)3 (0) = E ′′ h − E ′′− h h . Then, expressing E ± h and E ′′± h with the help of the Taylor formulae for E ( x ) and E ′′ ( x ) at x = 0 , we obtain: P (0) = E (0) + h E ′′ (0) + O (cid:16) h (cid:17)! − h (cid:16) E ′′ (0) + O (cid:16) h (cid:17)(cid:17) = E (0) + O (cid:16) h (cid:17) ,P ′ (0) = E ′ (0) + h E (3) (0) + O (cid:16) h (cid:17)! − h (cid:16) E (3) (0) + O (cid:16) h (cid:17)(cid:17) = E ′ (0) + O (cid:16) h (cid:17) ,P ′′ (0) = E ′′ (0) + O (cid:16) h (cid:17) ,P (3)3 (0) = E (3) (0) + O (cid:16) h (cid:17) . (C.2)Since P ( x ) is a cubic polynomial, we can write: P ( x ) = X k =0 P ( k )3 (0) k ! x k . Moreover, as E ( x ) is smooth on (cid:16) − h , h (cid:17) , the Taylor formula yields: E ( x ) = X k =0 E ( k ) (0) k ! x k + O ( h ) , x ∈ − h , h ! . Hence, using equalities (C.2), we obtain: P ( x ) − E ( x ) = X k =0 P ( k )3 (0) − E ( k ) (0) k ! x k + O (cid:16) h (cid:17) = O (cid:16) h (cid:17) , x ∈ − h , h ! . Software engineering
Although calculating the coefficients g ijk in (22) is straightforward, it is a te-dious and error prone task. As such, it is a natural choice for automation. Note thatautomation will become an absolute necessity should we wish to extend the methodof this paper, say, to a quintic nonlinearity , or a multidimensional setting.In the current paper, we developed simple scripts which automate the calculationof the constants g ijk . The general approach is to use a template file to generate adifferent Maple script for each coefficient. For i, j, k = 0 , . . . , , Maple’s symbolicutilities calculate the function g ijk ( ν, ˜ h ) , while its code generation utilities thengenerate the required Matlab function to evaluate the expression.The scripts are available under the GPL license at the following URL: ∼ guybar/1DNLH. References [1] M. Born, E. Wolf, Principles of Optics: Electromagnetic Theory of Propagation,Interference and Diffraction of Light, 7th edition, Cambridge University Press,Cambridge, 1999.[2] H. Wilhelm, Analytical solution of the boundary-value problem for the nonlinearHelmholtz equation, J. Math. Phys. 11 (1970) 824–826.[3] J. H. Marburger, F. S. Felber, Theory of a lossless nonlinear Fabry-Perotinterferometer, Phys. Rev. A 17 (1978) 335–342.[4] W. Chen, D. L. Mills, Optical response of a nonlinear dielectric film, Phys. Rev. B 35(1987) 524–532.[5] W. Chen, D. L. Mills, Optical response of nonlinear multilayer structures: Bilayersand superlattices, Phys. Rev. B 36 (1987) 6269–6278.[6] R. Knaap, G. Papanicolau, B. White, Nonlinearity and localization in one-dimensionalrandom media, Springer Proceedings in Physics 39 (1989) 2–26.[7] H. V. Baghdasaryan, T. M. Knyazyan, Problem of plane EM wave self-action inmultilayer structure: an exact solution, Optical and Quantum Electronics 31 (9–10)(1999) 1059–1072.[8] M. Midrio, Shooting technique for the computation of plane-wave reflection andtransmission through one-dimensional nonlinear inhomogeneous dielectric structures,J. Opt. Soc. Am. B — Opt. Phys. 18 (12) (2001) 1866–1871.[9] P. K. Kwan, Y. Y. Lu, Computing optical bistability in one-dimensional nonlinearstructures, Optics Coomunications 238 (1–3) (2004) 169–175.
10] J. Petráˇcek, Frequency-domain simulation of electromagnetic wave propagation inone-dimensional nonlinear structures, Optics Coomunications 265 (1) (2006) 331–335.[11] G. Fibich, S. V. Tsynkov, High-order two-way artificial boundary conditions fornonlinear wave propagation with backscattering, J. Comput. Phys. 171 (2001) 632–677.[12] G. Fibich, S. V. Tsynkov, Numerical solution of the nonlinear Helmholtz equationusing nonorthogonal expansions, J. Comput. Phys. 210 (2005) 183–224.[13] G. Baruch, G. Fibich, S. Tsynkov, Numerical solution of the nonlinear Helmholtzequation with axial symmetry, Journal of Computational and Applied Mathematics204 (2) (2007) 477–492.[14] A. Suryanto, M. E. van Groesen, H.J.W.M. Hoekstra, A finite element scheme to studythe nonlinear optical response of a finite grating without and with defect, Opt. andQuant. Elec. 35 (2002) 313–332.[15] A. Suryanto, E. van Groesen, M. Hammer, Finite element analysis of optical bistabilityin one-dimensional nonlinear photonic band gap structures with a defect, J. NonlinearOpt. Phys. and Materials 12 (2003) 187–204.[16] I. Singer, E. Turkel, High-order finite difference methods for the Helmholtz equation,Comput. Methods Appl. Mech. Engrg. 163 (1-4) (1998) 343–358.[17] J. Gómez-Gardeñes, L. M. Floría, M. Peyrard, A. R. Bishop, NonintegrableSchrödinger discrete breathers, Chaos 14 (4) (2004) 1130–1147.[18] S. Aubry, T. Cretegny, Mobility and reactivity of discrete breathers, Phys. D 119 (1-2)(1998) 34–46, localization in nonlinear lattices (Dresden, 1997).[19] T. Cretegny, S. Aubry, Spatially inhomogeneous time-periodic propagating waves inanharmonic systems, Physical Review B 55 (18) (1997) R11 929–932.[20] J. L. Marín, S. Aubry, Breathers in nonlinear lattices: numerical calculation from theanticontinuous limit, Nonlinearity 9 (6) (1996) 1501–1528.[21] R. S. MacKay, S. Aubry, Proof of existence of breathers for time-reversible orHamiltonian networks of weakly coupled oscillators, Nonlinearity 7 (6) (1994) 1623–1643.[22] P. G. Ciarlet, The Finite Element Method for Elliptic Problems, Vol. 40 of Classicsin Applied Mathematics, Society for Industrial and Applied Mathematics (SIAM),Philadelphia, PA, 2002, reprint of the 1978 original [North-Holland, Amsterdam;MR0520174 (58
25] Y. S. Choi, P. J. McKenna, A mountain pass method for the numerical solution ofsemilinear elliptic problems, Nonlinear Anal. 20 (4) (1993) 417–437.[26] G. A. Kriegsmann, C. S. Morawetz, Solving the Helmholtz equation for exteriorproblems with variable index of refraction. I, SIAM J. Sci. Statist. Comput. 1 (3)(1980) 371–385.[27] G. A. Kriegsmann, C. S. Morawetz, Computations with the nonlinear Helmholtzequation, J. Opt. Soc. Amer. 71 (8) (1981) 1015–1019.[28] A. Bayliss, G. A. Kriegsmann, C. S. Morawetz, The nonlinear interaction of a laserbeam with a plasma pellet, Comm. Pure Appl. Math. 36 (4) (1983) 399–414.[29] M. Sever, An existence theorem for some semilinear elliptic systems, J. DifferentialEquations 226 (2) (2006) 572–593.[30] L. D. Landau, E. M. Lifshitz, Course of Theoretical Physics. Vol. 8, Electrodynamicsof Continuous Media, Pergamon Press, Oxford, 1984.[31] J. D. Jackson, Classical Electrodynamics, 3rd Edition, Wiley, New-York, 1998.25] Y. S. Choi, P. J. McKenna, A mountain pass method for the numerical solution ofsemilinear elliptic problems, Nonlinear Anal. 20 (4) (1993) 417–437.[26] G. A. Kriegsmann, C. S. Morawetz, Solving the Helmholtz equation for exteriorproblems with variable index of refraction. I, SIAM J. Sci. Statist. Comput. 1 (3)(1980) 371–385.[27] G. A. Kriegsmann, C. S. Morawetz, Computations with the nonlinear Helmholtzequation, J. Opt. Soc. Amer. 71 (8) (1981) 1015–1019.[28] A. Bayliss, G. A. Kriegsmann, C. S. Morawetz, The nonlinear interaction of a laserbeam with a plasma pellet, Comm. Pure Appl. Math. 36 (4) (1983) 399–414.[29] M. Sever, An existence theorem for some semilinear elliptic systems, J. DifferentialEquations 226 (2) (2006) 572–593.[30] L. D. Landau, E. M. Lifshitz, Course of Theoretical Physics. Vol. 8, Electrodynamicsof Continuous Media, Pergamon Press, Oxford, 1984.[31] J. D. Jackson, Classical Electrodynamics, 3rd Edition, Wiley, New-York, 1998.