A modified Kačanov iteration scheme with application to quasilinear diffusion models
AA MODIFIED KA ˇCANOV ITERATION SCHEME WITHAPPLICATION TO QUASILINEAR DIFFUSION MODELS
PASCAL HEID AND THOMAS P. WIHLER
Abstract.
The classical Kaˇcanov scheme for the solution of nonlinear varia-tional problems can be interpreted as a fixed point iteration method that up-dates a given approximation by solving a linear problem in each step. Basedon this observation, we introduce a modified Kaˇcanov method , which allows for(adaptive) damping, and, thereby, to derive a new convergence analysis undermore general assumptions and for a wider range of applications. For instance,in the specific context of quasilinear diffusion models, our new approach doesno longer require a standard monotonicity condition on the nonlinear diffusioncoefficient to hold. Moreover, we propose two different adaptive strategies forthe practical selection of the damping parameters involved. Introduction
In this article we focus on a novel iterative Kaˇcanov type procedure for thesolution of quasilinear elliptic partial differential equations (PDE) of the form − div (cid:8) µ (cid:0) |∇ u | (cid:1) ∇ u (cid:9) = f in Ω (1a) u = g on Γ (1b) − µ (cid:0) |∇ u | (cid:1) ∇ u · n = h on Γ , (1c)where Ω ⊂ R d , d ∈ { , } , is a non-empty, open, and bounded domain with aLipschitz boundary ∂ Ω. We suppose that ∂ Ω = Γ ∪ Γ is composed of a Dirichletboundary part Γ (cid:54) = ∅ (of non-zero surface measure) and a Neumann boundarypart Γ , and n denotes the unit outward normal vector on Γ . Moreover, µ is areal-valued diffusion coefficient, and g and h are Dirichlet and Neumann boundarycondition functions, respectively. Nonlinear equations of this type are widely usedin mathematical models of physical applications including, for instance, hydro-and gas-dynamics, as well as elasticity and plasticity, see, e.g., [HJS97] and thereferences therein; we further refer to [Zei88, § § u with u = g on Γ , the traditional Kaˇcanov schemefor the solution of (1) is given by − div (cid:8) µ (cid:0) |∇ u n | (cid:1) ∇ u n +1 (cid:9) = f in Ω (2a) u n +1 = g on Γ (2b) − µ (cid:0) |∇ u n | (cid:1) ∇ u n +1 · n = h on Γ ; (2c) Mathematical Institute, University of Oxford, Oxford, OX2 6GG, UKMathematics Institute, University of Bern, CH-3012 Bern, Switzerland
E-mail addresses : [email protected], [email protected] .2010 Mathematics Subject Classification.
Key words and phrases.
Quasilinear elliptic PDE, strongly monotone problems, fixed pointiterations, Kaˇcanov method, quasi-Newtonian fluids, shear-thickening fluids.The authors acknowledge the financial support of the Swiss National Science Foundation (SNF),Grant No. 200021 182524, and Project No. P2BEP2 191760. a r X i v : . [ m a t h . NA ] J a n MODIFIED KAˇCANOV METHOD this iteration scheme was originally introduced by Kaˇcanov in [Kaˇc59], in the con-text of variational methods for plasticity problems. Observing that the aboveboundary value problem is a linear PDE for u n +1 (given u n ), we can see Kaˇcanov’sscheme as an iterative linearization method . In the literature, standard assump-tions on the nonlinearity µ , which guarantee the convergence of (2), are expressedas follows:( µ
1) The diffusion function µ : [0 , ∞ ) → [0 , ∞ ) is continuously differentiable;( µ
2) The diffusion function µ is decreasing, i.e., µ (cid:48) ( t ) ≤ t ≥ µ
3) There are positive constants m µ and M µ such that m µ ≤ µ ( t ) ≤ M µ for all t ≥ µ
4) There exists a positive constant c µ > µ (cid:48) ( t ) t + µ ( t ) ≥ c µ forall t ≥ µ µ Novelty of the paper.
In this work we address the open question of whether themonotonicity assumption ( µ
2) is necessary or not for the convergence of the Kaˇcanovscheme. Indeed, numerical experiments in [GMZ11, HW20a] indicate that it maybe dropped. Based on this observation, in this work, we will introduce a mod-ified Kaˇcanov iteration method that converges without imposing condition ( µ = ∅ , upon defining the PDE residual F := − div (cid:8) µ (cid:0) |∇ ( · ) | (cid:1) ∇ ( · ) (cid:9) − f, and, for given u , the linear preconditioning operator A ( u )( · ) := − div (cid:8) µ (cid:0) |∇ u | (cid:1) ∇ ( · ) (cid:9) , the iterative procedure (2) can be written (formally) in terms of the fixed pointiteration u n +1 = u n − A ( u n ) − F ( u n ) , n ≥ . With the aim of obtaining an improved control of the updates in each step, weintroduce a step size parameter δ ( u n ) > u n +1 = u n − δ ( u n ) A ( u n ) − F ( u n ) , n ≥ . This yields the modified Kaˇcanov method proposed and analyzed in this work.
Outline.
We begin by deriving an appropriate framework for abstract nonlinearvariational problems in §
2. In particular, we introduce a modified version of theclassical Kaˇcanov iteration scheme, and prove a new convergence result under as-sumptions that are milder than in the classical setting. The purpose of § §
4, which also contains a numerical study within the frameworkof finite element discretizations. Finally, we add some concluding remarks in § Abstract analysis
Throughout, Y is a reflexive real Banach space, equipped with a norm denotedby (cid:107) · (cid:107) Y , and K ⊂ Y is a closed, convex subset. ODIFIED KAˇCANOV METHOD 3
Nonlinear variational problems.
Consider a (nonlinear) Gˆateaux contin-uously differentiable functional H : K → R that has a strongly monotone Gˆateauxderivative, i.e., there exists a constant ν > (cid:104) H (cid:48) ( u ) − H (cid:48) ( v ) , u − v (cid:105) ≥ ν (cid:107) u − v (cid:107) Y ∀ u, v ∈ K, (3)where (cid:104)· , ·(cid:105) is the duality product on Y (cid:63) × Y . Proposition 2.1.
Suppose that H : K → R is a (Gˆateaux) continuously differen-tiable functional that satisfies the strong monotonicity condition (3) . Then, thereexists a unique minimizer u (cid:63) ∈ K of H in K , i.e., H ( u (cid:63) ) ≤ H ( v ) for all v ∈ K .Furthermore, u (cid:63) ∈ K is the unique solution of the weak inequality (cid:104) H (cid:48) ( u (cid:63) ) , v − u (cid:63) (cid:105) ≥ ∀ v ∈ K. (4) Proof.
We follow along the lines of the proof of [Zei90, Thm. 25.L].1. Since H is a Gˆateaux continuously differentiable functional on K , it is, in par-ticular, continuous on K . Moreover, from (3) we infer that H is strictly convex,see, e.g., [Zei90, Prop. 25.10]. These two properties, in turn, imply that H isweakly sequentially lower semicontinuous, see [Zei90, Prop. 25.20].2. If the set K is bounded, then the functional H , being weakly sequentially lowersemicontinuous, has a minimum on K , see [Zei90, Thm. 25.C]. Otherwise, if K is unbounded, then we show that H is weakly coercive. To this end, take any u, v ∈ K , and define the scalar function ϕ ( t ) := H ( u + t ( v − u )) , t ∈ [0 , K is convex, note that u + t ( v − u ) ∈ K for all t ∈ [0 , ϕ (cid:48) ( t ) = (cid:104) H (cid:48) ( u + t ( v − u )) , v − u (cid:105) . (6)Thus, by virtue of the fundamental theorem of calculus, we deduce that H ( v ) − H ( u ) = (cid:90) (cid:104) H (cid:48) ( u + t ( v − u )) , v − u ) (cid:105) d t = (cid:90) (cid:104) H (cid:48) ( u + t ( v − u )) − H (cid:48) ( u ) , v − u (cid:105) d t + (cid:104) H (cid:48) ( u ) , v − u (cid:105) . Therefore, exploiting (3), it follows that H ( v ) − H ( u ) ≥ ν (cid:107) v − u (cid:107) Y − (cid:107) H (cid:48) ( u ) (cid:107) Y (cid:63) (cid:107) v − u (cid:107) Y . Hence, we see that H ( v ) → ∞ for (cid:107) v (cid:107) Y → ∞ , i.e., H is weakly coercive on K .Then, owing to [Zei90, Thm. 25.D], we conclude that H has a minimum u (cid:63) in K . We note that this minimum is unique since H is strictly convex.3. Finally, if u (cid:63) is the minimum of H in K , then the function ϕ ( t ) from (5), with u = u (cid:63) , has a minimum at t = 0. This implies that ϕ (cid:48) (0) ≥
0. In turn,exploiting (6), this holds true if and only if (cid:104) H (cid:48) ( u (cid:63) ) , v − u (cid:63) (cid:105) ≥
0, which yields (4).Conversely, since H (cid:48) is strongly monotone, cf. (3), and u (cid:63) satisfies the weakinequality (4), the function ϕ ( t ) is increasing on [0 , t ∈ (0 , ϕ (cid:48) ( t ) = 1 t (cid:104) H (cid:48) ( u (cid:63) + t ( v − u (cid:63) )) − H (cid:48) ( u (cid:63) ) , t ( v − u (cid:63) ) (cid:105) + (cid:104) H (cid:48) ( u (cid:63) ) , v − u (cid:63) (cid:105)≥ t (cid:104) H (cid:48) ( u (cid:63) + t ( v − u (cid:63) )) − H (cid:48) ( u (cid:63) ) , t ( v − u (cid:63) ) (cid:105) ≥ νt (cid:107) v − u (cid:63) (cid:107) Y ≥ . Therefore, we obtain H ( v ) = ϕ (1) ≥ ϕ (0) = H ( u (cid:63) ) , i.e., u (cid:63) is the minimum of H in K since v was arbitrary.This completes the proof. (cid:3) MODIFIED KAˇCANOV METHOD
Remark 2.2.
We emphasize that the application of [Zei90, Thms. 25.C & 25.D]in the proof of Proposition 2.1 require the space Y to be reflexive. More precisely,these results make use of the fact that every bounded sequence in a reflexive Banachspace has a weakly convergent subsequence.Consider the following assumption on the closed and convex subset K :(K) The set X := { u − v : u, v ∈ K } is a linear closed subspace of Y , and x + y ∈ K for all x ∈ X and y ∈ K . Corollary 2.3.
Suppose that the subset K has property (K) , and let the assump-tions of Proposition 2.1 hold true. Then, the unique minimizer u (cid:63) ∈ K of H satisfies (cid:104) H (cid:48) ( u (cid:63) ) , v (cid:105) = 0 ∀ v ∈ X. (7) Proof.
Let u (cid:63) ∈ K be the unique minimizer of H on K . Then, for any v ∈ X ,owing to property (K), it holds that v + u (cid:63) ∈ K . Thus, using (4), we have 0 ≤(cid:104) H (cid:48) ( u (cid:63) ) , ( v + u (cid:63) ) − u (cid:63) (cid:105) = (cid:104) H (cid:48) ( u (cid:63) ) , v (cid:105) . Similarly, upon replacing v by − v , we inferthat 0 ≤ − (cid:104) H (cid:48) ( u (cid:63) ) , v (cid:105) , which concludes the argument. (cid:3) Modified Kaˇcanov method.
We consider mappings a : K × Y × X → R and b : K × X → R , with X the linear subspace from property (K) above, whichsatisfy the following properties:(A1) For any given u ∈ K , we suppose that a ( u ; · , · ) is a bilinear form on Y × X ,and b ( u, · ) ∈ X (cid:63) ; in the sequel, we use the notation b ( u, · ) = (cid:104) b ( u ) , ·(cid:105) , wherethe dual product is evaluated on the space X (cid:63) × X .(A2) There exist positive constants α, β > u ∈ K , the form a ( u ; · , · ) is uniformly bounded on Y × X and coercive on X × X in the sensethat a ( u ; v, w ) ≤ β (cid:107) v (cid:107) Y (cid:107) w (cid:107) Y ∀ v ∈ Y, ∀ w ∈ X, (8)and a ( u ; v, v ) ≥ α (cid:107) v (cid:107) Y ∀ v ∈ X, respectively; in particular, if the set K satisfies property (K), then it followsthat a ( u ; v − w, v − w ) ≥ α (cid:107) v − w (cid:107) Y ∀ v, w ∈ K. (9)(A3) There are Gˆateaux continuously differentiable functionals G : K → R and B : K → R such that, for all u ∈ K , it holds G (cid:48) ( u ) | X = a ( u ; u, · ) and B (cid:48) ( u ) | X = b ( u ) in X (cid:63) .(A4) The (continuously differentiable) functional H : K → R given by H ( u ) := G ( u ) − B ( u ), u ∈ K , with G and B from (A3), satisfies the strong monotonicitycondition (3).If the closed and convex subset K ⊂ Y fulfils property (K), then the uniqueminimizer u (cid:63) ∈ K of the functional H from (A4) solves the weak formulation0 = (cid:104) H (cid:48) ( u (cid:63) ) , v (cid:105) = (cid:104) G (cid:48) ( u (cid:63) ) − B (cid:48) ( u (cid:63) ) , v (cid:105) = a ( u (cid:63) ; u (cid:63) , v ) − (cid:104) b ( u (cid:63) ) , v (cid:105) ∀ v ∈ X, (10)cf. Corollary 2.3. Now, for given u ∈ K , define the linear operator A ( u ) : Y → X (cid:63) , v (cid:55)→ A ( u ) v , by (cid:104) A ( u ) v, w (cid:105) = a ( u ; v, w ) ∀ w ∈ X. Then, the weak formulation (10) can be expressed by A ( u (cid:63) ) u (cid:63) = b ( u (cid:63) ) in X (cid:63) . In light of (A2), for any u ∈ K , we notice that a ( u ; · , · ) is a bounded and coercivebilinear form on the closed subspace X . In particular, due to the Lax-Milgramtheorem, for any (cid:96) ∈ X (cid:63) , we conclude that there exists a unique w (cid:96) ∈ X such that ODIFIED KAˇCANOV METHOD 5 A ( u ) w (cid:96) = (cid:96) in X (cid:63) , i.e., A ( u ) | X : X → X (cid:63) is invertible for any u ∈ K . Hence,noticing that F ( u ) := H (cid:48) ( u ) = A ( u ) u − b ( u ) ∈ X (cid:63) , (11)the classical Kaˇcanov method in abstract form, for given u n ∈ K , is given by u n +1 = u n − ρ n , n ≥ , (12a)where ρ n ∈ X is uniquely defined through A ( u n ) ρ n = F ( u n ) in X (cid:63) . (12b)A modification of this procedure is obtained by invoking a parameter δ : K → (0 , ∞ ), thereby yielding the new scheme u n +1 = u n − δ ( u n ) ρ n , n ≥ , (13)with ρ n as in (12b). Equivalently, upon introducing the forms a δ ( u ; v, w ) := 1 δ ( u ) a ( u ; v, w ) , b δ ( u ) := 1 δ ( u ) a ( u ; u, · ) − F ( u ) , (14)for u ∈ K, v ∈ Y, and w ∈ X , we derive the modified Kaˇcanov iteration in weakform: u n +1 ∈ K : a δ ( u n ; u n +1 , v ) = (cid:104) b δ ( u n ) , v (cid:105) ∀ v ∈ X, n ≥ . (15)Clearly, for δ = 1, the traditional Kaˇcanov scheme (12) is recovered. Proposition 2.4.
Suppose that K satisfies property (K) . Then, for any initialguess u ∈ K , the modified Kaˇcanov iteration (15) remains well-defined for each n ≥ , i.e., for given u n ∈ K , the solution u n +1 ∈ K of the weak formulation (15) exists and is unique.Proof. For fixed u n ∈ K , the solution ρ n ∈ X of (12b) exists and is unique.Moreover, owing to property (K), we infer that u n +1 ∈ K in (13). (cid:3) Convergence analysis.
We are now in the position to state and prove themain result of our work.
Theorem 2.5.
Given the assumptions of Proposition 2.4. Moreover, suppose that H (cid:48) is Lipschitz continuous in the sense that there exists a constant L H > suchthat (cid:104) H (cid:48) ( u ) − H (cid:48) ( v ) , w (cid:105) ≤ L H (cid:107) u − v (cid:107) Y (cid:107) w (cid:107) Y ∀ u, v ∈ K, w ∈ X. (16) If δ : K → [ δ min , δ max ] , with < δ min ≤ δ max < α / L H , then the damped Kaˇcanoviteration (13) converges to the unique solution u (cid:63) ∈ K of (7) for any initial guess u ∈ K . Remark 2.6.
The key observation in our work is that the introduction of a damp-ing factor in the modified scheme (13) allows to weaken the assumptions in the clas-sical convergence theorem for the Kaˇcanov method, see, e.g., [Zei90, Thm. 25.L].Specifically, the inequality G ( u ) − G ( v ) ≥
12 ( a ( u ; u, u ) − a ( u ; v, v )) ∀ u, v ∈ K, (17)cf. [Zei90, Eq. (106)], which plays a crucial role in the standard analysis, can bedropped; instead, the proof of Theorem 2.5 hinges essentially on the bound (20)below. Furthermore, in contrast to the traditional framework, the operator B fromassumption (A3) need not to be linear in our analysis. We remark that theseimprovements come at the expense of condition (K) as well as of the Lipschitzcontinuity condition (16); with regards to the application to quasilinear ellipticPDE (1), however, these assumptions do not implicate any drawback. MODIFIED KAˇCANOV METHOD
Our proof of Theorem 2.5 follows to some extent the lines of the proof of [Zei90,Thm. 25.L], however, with some major modifications due to the absence of the keyinequality (17) .
Proof of Theorem 2.5.
We proceed in several steps.1. We show that (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y vanishes as n → ∞ . To this end, as in the proofof Proposition 2.1, we define the scalar function ϕ ( t ) := H ( u n + t ( u n +1 − u n )),for t ∈ [0 ,
1] and n ≥
0. Then, we find that H ( u n +1 ) − H ( u n ) = (cid:90) (cid:10) H (cid:48) ( u n + t ( u n +1 − u n )) , u n +1 − u n (cid:11) d t (18)= (cid:90) (cid:10) H (cid:48) ( u n + t ( u n +1 − u n )) − H (cid:48) ( u n ) , u n +1 − u n (cid:11) d t + (cid:10) H (cid:48) ( u n ) , u n +1 − u n (cid:11) . Combining (11) and (14), we note that H (cid:48) ( u ) = F ( u ) = a δ ( u ; u, · ) − b δ ( u ) ∀ u ∈ K (19)in X (cid:63) . Hence, by invoking the Lipschitz continuity (16), and upon exploitingthe modified Kaˇcanov scheme (15), we obtain H ( u n +1 ) − H ( u n ) ≤ L H (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y + a δ ( u n ; u n , u n +1 − u n ) − (cid:10) b δ ( u n ) , u n +1 − u n (cid:11) = L H (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y − a δ ( u n ; u n +1 − u n , u n +1 − u n ) . Furthermore, employing the coercivity assumption (9), it follows that H ( u n +1 ) − H ( u n ) ≤ (cid:18) L H − αδ ( u n ) (cid:19) (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y . Thus, since δ ( u n ) ≤ δ max < α / L H , we deduce the bound H ( u n ) − H ( u n +1 ) ≥ (cid:18) αδ max − L H (cid:19) (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y > , (20)for u n +1 (cid:54) = u n . In particular, we infer that { H ( u n ) } n ≥ is a decreasing sequencewhich is bounded from below by H ( u (cid:63) ), and thus converges. Therefore,0 ≤ (cid:18) αδ max − L H (cid:19) (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y ≤ (cid:0) H ( u n ) − H ( u n +1 ) (cid:1) → , as n → ∞ , i.e., lim n →∞ (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y = 0.2. We show that { u n } n ≥ converges to u (cid:63) . For this purpose, we will follow closelythe proof of [HW20a, Prop. 2.1]. By the strong monotonicity (3), for any m ≥ n ≥
0, it holds that ν (cid:107) u m − u n (cid:107) Y ≤ (cid:104) H (cid:48) ( u m ) − H (cid:48) ( u n ) , (cid:15) m,n (cid:105) , with (cid:15) m,n := u m − u n ∈ X . Invoking (19) and using (15), this leads to ν (cid:107) (cid:15) m,n (cid:107) Y ≤ a δ ( u m ; u m , (cid:15) m,n ) − (cid:104) b δ ( u m ) , (cid:15) m,n (cid:105)− a δ ( u n ; u n , (cid:15) m,n ) + (cid:104) b δ ( u n ) , (cid:15) m,n (cid:105)≤ a δ ( u m ; u m − u m +1 , (cid:15) m,n ) − a δ ( u n ; u n − u n +1 , (cid:15) m,n ) . Applying (8) yields ν (cid:107) (cid:15) m,n (cid:107) Y ≤ βδ min (cid:107) (cid:15) m,n (cid:107) (cid:0)(cid:13)(cid:13) u m − u m +1 (cid:13)(cid:13) Y + (cid:13)(cid:13) u n − u n +1 (cid:13)(cid:13) Y (cid:1) , ODIFIED KAˇCANOV METHOD 7 and thus (cid:107) (cid:15) m,n (cid:107) Y ≤ βνδ min (cid:0)(cid:13)(cid:13) u m − u m +1 (cid:13)(cid:13) Y + (cid:13)(cid:13) u n − u n +1 (cid:13)(cid:13) Y (cid:1) . From the first step of the proof, we conclude that { u n } n ≥ is a Cauchy sequencein K . Since K is a closed subset of a Banach space, the sequence { u n } n ≥ hassome limit u ∈ K .3. We show that u is a solution of (7), which is unique thanks to Proposition 2.1and Corollary 2.3. Indeed, from (19) and (15), for all w ∈ X , it follows that a δ ( u n ; u n +1 − u n , w ) + (cid:104) H (cid:48) ( u n ) , w (cid:105) = a δ ( u n ; u n +1 , w ) − (cid:104) b δ ( u n ) , w (cid:105) = 0 . Recalling that { u n +1 − u n } n ≥ is a vanishing sequence in X , and exploiting (8),we have a δ ( u n ; u n +1 − u n , w ) → n → ∞ , for any w ∈ X . Moreover, by theLipschitz continuity (16), we obtain (cid:104) H (cid:48) ( u ) , w (cid:105) = lim n → (cid:104) H (cid:48) ( u n ) , w (cid:105) = 0 ∀ w ∈ X, (21)i.e., u = u (cid:63) is the unique solution of (7).This completes the argument. (cid:3) Remark 2.7.
We can weaken the Lipschitz continuity assumption (16) in Theo-rem 2.5 by the condition (cid:104) H (cid:48) ( u ) − H (cid:48) ( v ) , u − v (cid:105) ≤ L H (cid:107) u − v (cid:107) Y ∀ u, v ∈ K, (22)provided that H (cid:48) is continuous with respect to the weak topology on X (cid:63) . Indeed,the first step of the proof of Theorem 2.5 requires only the weaker assumption (22),and (21) will be implied by the continuity of H (cid:48) with respect to the weak topologyon X (cid:63) . Remark 2.8.
Applying the abstract analysis in [HPW20], it can be shown thatthe iterates generated by the modified Kaˇcanov scheme (15) satisfy the contractionproperty H ( u n +1 ) − H ( u (cid:63) ) ≤ q ( H ( u n ) − H ( u (cid:63) )) ∀ n ≥ , for some constant 0 < q <
1, where u (cid:63) is the solution of (7). The contractionfactor q , as derived in [HPW20], depends on the damping factor δ , and is, a priori,minimal for δ ≡ α / L H . We note, however, as the derivation of q contains some roughestimates, that this choice is typically too pessimistic for practical purposes.3. Adaptive step size control
In this section, we will present two adaptive methods for selecting the dampingparameter δ in the modified Kaˇcanov iteration (15). To this end, recall the keyinequality H ( u n ) − H ( u n +1 ) ≥ γ (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y , n ≥ , (23)cf. (20), which is crucial for the convergence of the modified Kaˇcanov scheme (15).If we set δ max := α / L H in (20), which is a possibly pessimistic choice, as mentionedin Remark 2.8, then (23) holds for γ = L H / . Alternatively, within the setting ofthe classical Kaˇcanov scheme, i.e., δ ≡
1, the bound (23) can be shown for theconstant γ = α / under more restrictive assumptions on the nonlinearity; see theproof of [Zei90, Thm. 25.L]. This observation may suggest that a smaller choice of γ potentially relates to a larger size of the damping parameter. We thus proposethat the sequence { u n } n ≥ is required to satisfy an estimate of the form H ( u n ) − H ( u n +1 ) ≥ θ min { α, L H } (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y , (24) MODIFIED KAˇCANOV METHOD for a constant 0 < θ ≤ / . In our numerical experiments in § θ = 0 . δ min := α / L H , which, in viewof Remark 2.8, is a reasonable choice.3.1. Step size control via Taylor expansion.
Suppose that F := H (cid:48) : K → Y (cid:63) is Fr´echet differentiable, and define the (continuously differentiable) function ψ n :[0 , → R by ψ n ( t ) := (cid:10) F ( u n + t ( u n +1 − u n )) , u n +1 − u n (cid:11) , t ∈ [0 , . Then, if the difference u n +1 − u n is sufficiently small, applying a Taylor expansionat t = 0, yields ψ n ( t ) ≈ ψ n (0) + ψ (cid:48) n (0) t = (cid:10) F ( u n ) , u n +1 − u n (cid:11) + t (cid:10) F (cid:48) ( u n )( u n +1 − u n ) , u n +1 − u n (cid:11) . Since (13) implies that u n +1 − u n = − δ ( u n ) ρ n , n ≥ , we obtain ψ n ( t ) ≈ − δ ( u n ) (cid:104) F ( u n ) , ρ n (cid:105) + tδ ( u n ) (cid:104) F (cid:48) ( u n ) ρ n , ρ n (cid:105) . Moreover, using (11) and incorporating (18), we note that (cid:90) ψ n ( t ) d t = (cid:90) (cid:10) H (cid:48) ( u n + t ( u n +1 − u n )) , u n +1 − u n (cid:11) d t = H ( u n +1 ) − H ( u n ) . Hence, H ( u n ) − H ( u n +1 ) ≈ δ ( u n ) (cid:104) F ( u n ) , ρ n (cid:105) − δ ( u n ) (cid:104) F (cid:48) ( u n ) ρ n , ρ n (cid:105) , cp. (20). Then, a simple calculation reveals that the right-hand side is maximizedfor the damping parameter δ ( u n ) := (cid:104) F ( u n ) , ρ n (cid:105)(cid:104) F (cid:48) ( u n ) ρ n , ρ n (cid:105) . (25)In account of (24), this leads to the step size Algorithm 1. We note that thestopping criterion in line 6 will be satisfied once the damping parameter is smallenough, i.e., the procedure terminates after finitely many steps. Algorithm 1
Step size control via Taylor expansion
Input:
Given u n ∈ K , a correction factor σ ∈ (0 , θ ∈ (0 , / ]. Solve the linear problem A ( u n ) ρ n = F ( u n ) for ρ n ∈ X , cf. (12b). Define δ ( u n ) from (25). repeat Compute u n +1 := u n − δ ( u n ) ρ n , cf. (13). Set δ ( u n ) ← σδ ( u n ). until H ( u n ) − H ( u n +1 ) ≥ θ min { α, L H } (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y return u n +1 . ODIFIED KAˇCANOV METHOD 9
Step size control via a prediction-correction strategy.
We will presenta second adaptive damping parameter selection procedure that is partially based onideas from [Deu04, § F = H (cid:48) . The idea is fairly straightforward: For a given correction factor σ ∈ (0 ,
1) anddamping factor δ >
0, we compare the energy decay for the damping parameters δ and δ (cid:48) = σ p δ , where p ∈ {− , } depends on the previous step; we note that p = − p = 1 decreases the damping parameter. Ifapplying δ (cid:48) results in a larger energy decay then we choose the damping parameterto be δ (cid:48) in the present and subsequent steps, with p unchanged; otherwise, if δ outperforms δ (cid:48) , then δ is retained, however, in the next step we replace p by − p .This leads to Algorithm 2. Algorithm 2
Step size control via prediction-correction strategy
Input:
Given u n ∈ K , a damping parameter δ , an exponent p ∈ {− , } , a correc-tion factor σ ∈ (0 , θ ∈ (0 , / ]. Solve the linear problem A ( u n ) ρ n = F ( u n ) for ρ n ∈ X , cf. (12b). Set δ (cid:48) := σ p δ and compute (cid:101) u n +1 := u n − δ (cid:48) ρ n , cf. (13). if H ( u n ) − H ( (cid:101) u n +1 ) ≥ θ min { α, L H } (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y then Compute u n +1 := u n − δρ n , cf. (13). if H ( (cid:101) u n +1 ) ≤ H ( u n +1 ) then Set δ ← δ (cid:48) and u n +1 ← (cid:101) u n +1 . else Set p ← − p . end if else Set p ← δ ← δ (cid:48) , and u n +1 := (cid:101) u n +1 . while H ( u n ) − H ( u n +1 ) < θ min { α, L H } (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y do Set δ ← σδ and compute u n +1 := u n − δρ n , cf. (13). end while end if return δ , u n +1 , and p .4. Application to quasilinear diffusion models
In this section, we discuss the weak formulations of the boundary value prob-lem (1) as well as of the Kaˇcanov iteration scheme (2). In addition, an equivalentvariational setting will be established. Furthermore, a series of numerical experi-ments in the framework of finite element discretizations will be presented.4.1.
Sobolev spaces.
Let Y := H (Ω) be the standard Sobolev space of L (Ω)-functions with weak derivatives in L (Ω). We endow Y with the inner product( u, v ) Y := (cid:90) Ω ∇ u · ∇ v d x + (cid:90) Ω uv d x , u, v ∈ Y, and with the induced H -norm (cid:107) u (cid:107) Y := (cid:112) ( u, u ) Y , u ∈ Y . Moreover, consider theclosed linear subspace X := { w ∈ Y : w | Γ = 0 } ⊂ Y , where w | Γ denotes thetrace of w ∈ Y on the (non-vanishing) Dirichlet boundary part Γ ⊂ ∂ Ω. We equip X with the H -seminorm (cid:107) u (cid:107) X := (cid:107)∇ u (cid:107) L (Ω) , for u ∈ X ; owing to the Poincar´e-Friedrichs inequality, we note that the norm (cid:107) · (cid:107) X is equivalent to the norm (cid:107)·(cid:107) Y on X , i.e., there exists a constant c > c (cid:107) u (cid:107) Y ≤ (cid:107) u (cid:107) X ≤ (cid:107) u (cid:107) Y for all u ∈ X . Finally, we consider the closed and convex subset K := { w ∈ Y : w | Γ = g on Γ } ⊂ Y, (26)with g the Dirichlet boundary data from (2b). Evidently, K has property (K). Inparticular, if Γ = ∂ Ω and g ≡ ∂ Ω, then we may consider Y = X = K = H (Ω)with the norm (cid:107) · (cid:107) X .4.2. Weak formulations.
For any given u ∈ K , we define a (symmetric) bilinearform a ( u ; · , · ) on Y × Y by a ( u ; v, w ) := (cid:90) Ω µ (cid:0) |∇ u | (cid:1) ∇ v · ∇ w d x , v ∈ Y, w ∈ X. (27)Moreover, we introduce the linear functional (cid:104) b, v (cid:105) := (cid:90) Ω f v d x − (cid:90) Γ hv d x , v ∈ X, (28)where (cid:104)· , ·(cid:105) denotes the duality pairing in X (cid:63) × X , with X (cid:63) signifying the dualspace of X . If the source function f ∈ L (Ω) and the Neumann boundary data h ∈ L ( ∂ Ω), then we notice that b ∈ X (cid:63) ; incidentally, more general assumptions onthe data can be made, see, e.g., [Zei90, Rem. 25.32].In terms of the above forms, the weak formulation of (1) reads as follows:Find u ∈ K : a ( u ; u, v ) = (cid:104) b, v (cid:105) ∀ v ∈ X. (29)Furthermore, for given u n ∈ K , n ≥
0, the weak form of the Kaˇcanov scheme (2)is to find u n +1 ∈ K such that a ( u n ; u n +1 , v ) = (cid:104) b, v (cid:105) ∀ v ∈ X. The ensuing result follows from standard arguments.
Proposition 4.1.
If the diffusion coefficient µ satisfies ( µ , then the form a ( · ; · , · ) from (27) is bounded in the sense that | a ( u ; v, w ) | ≤ M µ (cid:107) v (cid:107) Y (cid:107) w (cid:107) Y ∀ u ∈ K, ∀ v ∈ Y, ∀ w ∈ X. Moreover, there exists a constant α > such that, for any u ∈ Y , we have thecoercivity property a ( u ; v, v ) ≥ α (cid:107) v (cid:107) Y ∀ v ∈ X, and, especially, a ( u ; v − w, v − w ) ≥ α (cid:107) v − w (cid:107) Y ∀ v, w ∈ K. Variational framework.
We introduce a (nonlinear) functional G : K → R by G ( u ) := (cid:90) Ω ψ (cid:0) |∇ u | (cid:1) d x , with ψ ( s ) := 12 (cid:90) s µ ( t ) d t, s ≥ . (30)For u ∈ K , the Gˆateaux derivative of G is given by (cid:104) G (cid:48) ( u ) , v (cid:105) = (cid:90) Ω ψ (cid:48) (cid:0) |∇ u | (cid:1) ∇ u · ∇ v d x = (cid:90) Ω µ (cid:0) |∇ u | (cid:1) ∇ u · ∇ v d x , (31)for all v ∈ X , i.e., G (cid:48) ( u ) = a ( u ; u, · ) in X (cid:63) . Then, introduce the (energy) potential H : K → R by H ( u ) := G ( u ) − (cid:104) b, u (cid:105) , (32)with G and b from (30) and (28), respectively. If the diffusion coefficient µ satisfiesthe estimates m µ ( t − s ) ≤ µ ( t ) t − µ ( s ) s ≤ M µ ( t − s ) , t ≥ s ≥ , (33) ODIFIED KAˇCANOV METHOD 11 then the Lipschitz condition (16) and the strong monotonicity property (3) can bededuced with ν = m µ and L H = 3 M µ , cf. [Zei90, Prop. 25.26]. Remark 4.2.
It is easily shown that ( µ µ
4) implies (33), however, we emphasizethat the latter assumption does not require µ to be decreasing or differentiable.The following result is a direct consequence of Corollary 2.3. Proposition 4.3.
Suppose that the diffusion coefficient µ satisfies (33) . Then, thefunctional H from (32) has a unique minimizer u (cid:63) ∈ K , cf. (26) , which satisfiesthe weak formulation (cid:90) Ω µ (cid:0) |∇ u | (cid:1) ∇ u · ∇ v d x = (cid:90) Ω f v d x − (cid:90) Γ hv d x ∀ v ∈ X, (34) i.e., u (cid:63) is the unique (weak) solution of (29) . Convergence of the modified Kaˇcanov method.
Recall that the prop-erties (A1)–(A4) as well as (K) are satisfied if the diffusion coefficient obeys thebounds (33) by our analysis in the previous sections § § without assuming ( µ Theorem 4.4.
Assume that the diffusion coefficient satisfies the bounds (33) .Then, the damped Kaˇcanov method (15) , with δ : K → [ δ min , δ max ] and < δ min ≤ δ max < α / M µ , converges to the unique weak solution u (cid:63) ∈ K of (34) . Remark 4.5.
We emphasize that the assumptions on the damping function δ fromTheorem 4.4 are sufficient for the key inequality (20) to hold, however, they arenot necessary. Indeed, as both step size methods from § δ .4.5. Numerical experiments.
We will now perform a number of numerical testsfor the modified Kaˇcanov method based on the different step size methods from § − , \ ([0 , × [ − , R . We focus on homogeneous Dirichlet boundary conditions,i.e., Γ = ∂ Ω and g ≡
0, and therefore K = Y = X = H (Ω). The sourcefunction f in (1a) is chosen such that the analytical solution of (1a)–(1b), with g = 0, is given by the smooth function u (cid:63) ( x, y ) = sin( πx ) sin( πy ). For the numericalapproximation, we will use a conforming P -finite element framework with a uniformmesh consisting of approximately 3 · triangles. Throughout we set the correctionfactor in the adaptive step size algorithms to be σ = 0 . Monotonically decreasing diffusion.
We consider the nonlinear diffusion co-efficient µ ( t ) = ( t + 1) − + / , for t ≥
0, see Figure 1. It is straightforward to verifythat the diffusion parameter µ satisfies (33) as well as the properties ( µ µ §
3. Even though the dif-fusion parameter is monotonically decreasing and differentiable, which implies theconvergence of the classical Kaˇcanov scheme, we can see from Figure 2 that thedamped Kaˇcanov method with either the step size method from § § . . t µ µ Figure 1.
Diffusion coefficients.
Figure 2.
Experiment 4.5.1. Left: Comparison of the perfor-mance of the classical Kaˇcanov scheme (”Undamped”) with thestep size algorithms from § § Non-monotone diffusion.
In our second experiment, we consider the nonlin-ear diffusion parameter µ ( t ) = t exp( − t ) log( t + (cid:15) ) + 1, t ≥
0, for (cid:15) = 10 − , seeFigure 1. It can be shown that µ satisfies (33), however, is not monotonically (de-creasing). Even though Figure 3 indicates that the classical Kaˇcanov scheme (12)may still converge, we see that the convergence rate is rather poor. In contrast,the modified Kaˇcanov scheme (15) with either damping strategy from § Conclusion
In this work, we have devised a modified version of the classical Kaˇcanov itera-tion scheme. We have shown that the introduction of a damping parameter allowsto derive a new convergence analysis which applies to a wider class of problems.In particular, in the context of quasilinear elliptic PDE, a standard monotonicitycondition on the diffusion coefficient can be dropped. Moreover, our numerical testshighlight that the modified Kaˇcanov method, in combination with suitable step sizestrategies, outperforms the classical scheme for the examples under consideration.We close by remarking that our work can be extended in a straightforward man-ner to quasilinear systems with applications to, e.g., plasticity or quasi-Newtonianfluids.
References [AIM09] K. Astala, T. Iwaniec, and G. Martin,
Elliptic partial differential equations and qua-siconformal mappings in the plane , Princeton Mathematical Series, vol. 48, PrincetonUniversity Press, Princeton, NJ, 2009.
ODIFIED KAˇCANOV METHOD 13
Figure 3.
Experiment 4.5.2. Left: Comparison of the perfor-mance of the classical Kaˇcanov scheme (”Undamped”) with thestep size algorithms from § § [Deu04] P. Deuflhard, Newton methods for nonlinear problems , Springer Series in ComputationalMathematics, vol. 35, Springer-Verlag, Berlin, 2004, Affine invariance and adaptivealgorithms.[GMZ11] E. M. Garau, P. Morin, and C. Zuppa,
Convergence of an adaptive Kaˇcanov FEM forquasi-linear problems , Appl. Numer. Math. (2011), no. 4, 512–529.[HJS97] W. Han, S. Jensen, and I. Shimansky, The Kaˇcanov method for some nonlinear prob-lems , Appl. Numer. Meth. (1997), 57–79.[HPW20] P. Heid, D. Praetorius, and T. P. Wihler, A note on energy contraction and opti-mal convergence of adaptive iterative linearized finite element methods , Tech. Report2007.10750, arxiv.org, 2020.[HW20a] P. Heid and T. P. Wihler,
Adaptive iterative linearization Galerkin methods for non-linear problems , Math. Comp. (2020), no. 326, 2707–2734.[HW20b] , On the convergence of adaptive iterative linearized Galerkin methods , Calcolo (2020), no. 3, Paper No. 24, 23.[Kaˇc59] L. M. Kaˇcanov, Variational methods of solution of plasticity problems , J. Appl. Math.Mech. (1959), 880–883. MR 0112408[Zei88] E. Zeidler, Nonlinear functional analysis and its applications. IV , Springer-Verlag, NewYork, 1988, Applications to mathematical physics, Translated from the German andwith a preface by Juergen Quandt.[Zei90] ,