[PDF] A modified Kačanov iteration scheme with application to quasilinear diffusion models

Abstract

The classical Ka\v{c}anov scheme for the solution of nonlinear variational problems can be interpreted as a fixed point iteration method that updates a given approximation by solving a linear problem in each step. Based on this observation, we introduce a modified Ka\v{c}anov method, which allows for (adaptive) damping, and, thereby, to derive a new convergence analysis under more general assumptions and for a wider range of applications. For instance, in the specific context of quasilinear diffusion models, our new approach does no longer require a standard monotonicity condition on the nonlinear diffusion coefficient to hold. Moreover, we propose two different adaptive strategies for the practical selection of the damping parameters involved.

Full PDF

AA MODIFIED KA ˇCANOV ITERATION SCHEME WITHAPPLICATION TO QUASILINEAR DIFFUSION MODELS

PASCAL HEID AND THOMAS P. WIHLER

Abstract.

The classical Kaˇcanov scheme for the solution of nonlinear varia-tional problems can be interpreted as a ﬁxed point iteration method that up-dates a given approximation by solving a linear problem in each step. Basedon this observation, we introduce a modiﬁed Kaˇcanov method , which allows for(adaptive) damping, and, thereby, to derive a new convergence analysis undermore general assumptions and for a wider range of applications. For instance,in the speciﬁc context of quasilinear diﬀusion models, our new approach doesno longer require a standard monotonicity condition on the nonlinear diﬀusioncoeﬃcient to hold. Moreover, we propose two diﬀerent adaptive strategies forthe practical selection of the damping parameters involved. Introduction

In this article we focus on a novel iterative Kaˇcanov type procedure for thesolution of quasilinear elliptic partial diﬀerential equations (PDE) of the form − div (cid:8) µ (cid:0) |∇ u | (cid:1) ∇ u (cid:9) = f in Ω (1a) u = g on Γ (1b) − µ (cid:0) |∇ u | (cid:1) ∇ u · n = h on Γ , (1c)where Ω ⊂ R d , d ∈ { , } , is a non-empty, open, and bounded domain with aLipschitz boundary ∂ Ω. We suppose that ∂ Ω = Γ ∪ Γ is composed of a Dirichletboundary part Γ (cid:54) = ∅ (of non-zero surface measure) and a Neumann boundarypart Γ , and n denotes the unit outward normal vector on Γ . Moreover, µ is areal-valued diﬀusion coeﬃcient, and g and h are Dirichlet and Neumann boundarycondition functions, respectively. Nonlinear equations of this type are widely usedin mathematical models of physical applications including, for instance, hydro-and gas-dynamics, as well as elasticity and plasticity, see, e.g., [HJS97] and thereferences therein; we further refer to [Zei88, § § u with u = g on Γ , the traditional Kaˇcanov schemefor the solution of (1) is given by − div (cid:8) µ (cid:0) |∇ u n | (cid:1) ∇ u n +1 (cid:9) = f in Ω (2a) u n +1 = g on Γ (2b) − µ (cid:0) |∇ u n | (cid:1) ∇ u n +1 · n = h on Γ ; (2c) Mathematical Institute, University of Oxford, Oxford, OX2 6GG, UKMathematics Institute, University of Bern, CH-3012 Bern, Switzerland

E-mail addresses : [email protected], [email protected] .2010 Mathematics Subject Classiﬁcation.

Key words and phrases.

Quasilinear elliptic PDE, strongly monotone problems, ﬁxed pointiterations, Kaˇcanov method, quasi-Newtonian ﬂuids, shear-thickening ﬂuids.The authors acknowledge the ﬁnancial support of the Swiss National Science Foundation (SNF),Grant No. 200021 182524, and Project No. P2BEP2 191760. a r X i v : . [ m a t h . NA ] J a n MODIFIED KAˇCANOV METHOD this iteration scheme was originally introduced by Kaˇcanov in [Kaˇc59], in the con-text of variational methods for plasticity problems. Observing that the aboveboundary value problem is a linear PDE for u n +1 (given u n ), we can see Kaˇcanov’sscheme as an iterative linearization method . In the literature, standard assump-tions on the nonlinearity µ , which guarantee the convergence of (2), are expressedas follows:( µ

1) The diﬀusion function µ : [0 , ∞ ) → [0 , ∞ ) is continuously diﬀerentiable;( µ

2) The diﬀusion function µ is decreasing, i.e., µ (cid:48) ( t ) ≤ t ≥ µ

3) There are positive constants m µ and M µ such that m µ ≤ µ ( t ) ≤ M µ for all t ≥ µ

4) There exists a positive constant c µ > µ (cid:48) ( t ) t + µ ( t ) ≥ c µ forall t ≥ µ µ Novelty of the paper.

In this work we address the open question of whether themonotonicity assumption ( µ

2) is necessary or not for the convergence of the Kaˇcanovscheme. Indeed, numerical experiments in [GMZ11, HW20a] indicate that it maybe dropped. Based on this observation, in this work, we will introduce a mod-iﬁed Kaˇcanov iteration method that converges without imposing condition ( µ = ∅ , upon deﬁning the PDE residual F := − div (cid:8) µ (cid:0) |∇ ( · ) | (cid:1) ∇ ( · ) (cid:9) − f, and, for given u , the linear preconditioning operator A ( u )( · ) := − div (cid:8) µ (cid:0) |∇ u | (cid:1) ∇ ( · ) (cid:9) , the iterative procedure (2) can be written (formally) in terms of the ﬁxed pointiteration u n +1 = u n − A ( u n ) − F ( u n ) , n ≥ . With the aim of obtaining an improved control of the updates in each step, weintroduce a step size parameter δ ( u n ) > u n +1 = u n − δ ( u n ) A ( u n ) − F ( u n ) , n ≥ . This yields the modiﬁed Kaˇcanov method proposed and analyzed in this work.

Outline.

We begin by deriving an appropriate framework for abstract nonlinearvariational problems in §

2. In particular, we introduce a modiﬁed version of theclassical Kaˇcanov iteration scheme, and prove a new convergence result under as-sumptions that are milder than in the classical setting. The purpose of § §

4, which also contains a numerical study within the frameworkof ﬁnite element discretizations. Finally, we add some concluding remarks in § Abstract analysis

Throughout, Y is a reﬂexive real Banach space, equipped with a norm denotedby (cid:107) · (cid:107) Y , and K ⊂ Y is a closed, convex subset. ODIFIED KAˇCANOV METHOD 3

Nonlinear variational problems.

Consider a (nonlinear) Gˆateaux contin-uously diﬀerentiable functional H : K → R that has a strongly monotone Gˆateauxderivative, i.e., there exists a constant ν > (cid:104) H (cid:48) ( u ) − H (cid:48) ( v ) , u − v (cid:105) ≥ ν (cid:107) u − v (cid:107) Y ∀ u, v ∈ K, (3)where (cid:104)· , ·(cid:105) is the duality product on Y (cid:63) × Y . Proposition 2.1.

Suppose that H : K → R is a (Gˆateaux) continuously diﬀeren-tiable functional that satisﬁes the strong monotonicity condition (3) . Then, thereexists a unique minimizer u (cid:63) ∈ K of H in K , i.e., H ( u (cid:63) ) ≤ H ( v ) for all v ∈ K .Furthermore, u (cid:63) ∈ K is the unique solution of the weak inequality (cid:104) H (cid:48) ( u (cid:63) ) , v − u (cid:63) (cid:105) ≥ ∀ v ∈ K. (4) Proof.

We follow along the lines of the proof of [Zei90, Thm. 25.L].1. Since H is a Gˆateaux continuously diﬀerentiable functional on K , it is, in par-ticular, continuous on K . Moreover, from (3) we infer that H is strictly convex,see, e.g., [Zei90, Prop. 25.10]. These two properties, in turn, imply that H isweakly sequentially lower semicontinuous, see [Zei90, Prop. 25.20].2. If the set K is bounded, then the functional H , being weakly sequentially lowersemicontinuous, has a minimum on K , see [Zei90, Thm. 25.C]. Otherwise, if K is unbounded, then we show that H is weakly coercive. To this end, take any u, v ∈ K , and deﬁne the scalar function ϕ ( t ) := H ( u + t ( v − u )) , t ∈ [0 , K is convex, note that u + t ( v − u ) ∈ K for all t ∈ [0 , ϕ (cid:48) ( t ) = (cid:104) H (cid:48) ( u + t ( v − u )) , v − u (cid:105) . (6)Thus, by virtue of the fundamental theorem of calculus, we deduce that H ( v ) − H ( u ) = (cid:90) (cid:104) H (cid:48) ( u + t ( v − u )) , v − u ) (cid:105) d t = (cid:90) (cid:104) H (cid:48) ( u + t ( v − u )) − H (cid:48) ( u ) , v − u (cid:105) d t + (cid:104) H (cid:48) ( u ) , v − u (cid:105) . Therefore, exploiting (3), it follows that H ( v ) − H ( u ) ≥ ν (cid:107) v − u (cid:107) Y − (cid:107) H (cid:48) ( u ) (cid:107) Y (cid:63) (cid:107) v − u (cid:107) Y . Hence, we see that H ( v ) → ∞ for (cid:107) v (cid:107) Y → ∞ , i.e., H is weakly coercive on K .Then, owing to [Zei90, Thm. 25.D], we conclude that H has a minimum u (cid:63) in K . We note that this minimum is unique since H is strictly convex.3. Finally, if u (cid:63) is the minimum of H in K , then the function ϕ ( t ) from (5), with u = u (cid:63) , has a minimum at t = 0. This implies that ϕ (cid:48) (0) ≥

0. In turn,exploiting (6), this holds true if and only if (cid:104) H (cid:48) ( u (cid:63) ) , v − u (cid:63) (cid:105) ≥

0, which yields (4).Conversely, since H (cid:48) is strongly monotone, cf. (3), and u (cid:63) satisﬁes the weakinequality (4), the function ϕ ( t ) is increasing on [0 , t ∈ (0 , ϕ (cid:48) ( t ) = 1 t (cid:104) H (cid:48) ( u (cid:63) + t ( v − u (cid:63) )) − H (cid:48) ( u (cid:63) ) , t ( v − u (cid:63) ) (cid:105) + (cid:104) H (cid:48) ( u (cid:63) ) , v − u (cid:63) (cid:105)≥ t (cid:104) H (cid:48) ( u (cid:63) + t ( v − u (cid:63) )) − H (cid:48) ( u (cid:63) ) , t ( v − u (cid:63) ) (cid:105) ≥ νt (cid:107) v − u (cid:63) (cid:107) Y ≥ . Therefore, we obtain H ( v ) = ϕ (1) ≥ ϕ (0) = H ( u (cid:63) ) , i.e., u (cid:63) is the minimum of H in K since v was arbitrary.This completes the proof. (cid:3) MODIFIED KAˇCANOV METHOD

Remark 2.2.

We emphasize that the application of [Zei90, Thms. 25.C & 25.D]in the proof of Proposition 2.1 require the space Y to be reﬂexive. More precisely,these results make use of the fact that every bounded sequence in a reﬂexive Banachspace has a weakly convergent subsequence.Consider the following assumption on the closed and convex subset K :(K) The set X := { u − v : u, v ∈ K } is a linear closed subspace of Y , and x + y ∈ K for all x ∈ X and y ∈ K . Corollary 2.3.

Suppose that the subset K has property (K) , and let the assump-tions of Proposition 2.1 hold true. Then, the unique minimizer u (cid:63) ∈ K of H satisﬁes (cid:104) H (cid:48) ( u (cid:63) ) , v (cid:105) = 0 ∀ v ∈ X. (7) Proof.

Let u (cid:63) ∈ K be the unique minimizer of H on K . Then, for any v ∈ X ,owing to property (K), it holds that v + u (cid:63) ∈ K . Thus, using (4), we have 0 ≤(cid:104) H (cid:48) ( u (cid:63) ) , ( v + u (cid:63) ) − u (cid:63) (cid:105) = (cid:104) H (cid:48) ( u (cid:63) ) , v (cid:105) . Similarly, upon replacing v by − v , we inferthat 0 ≤ − (cid:104) H (cid:48) ( u (cid:63) ) , v (cid:105) , which concludes the argument. (cid:3) Modiﬁed Kaˇcanov method.

We consider mappings a : K × Y × X → R and b : K × X → R , with X the linear subspace from property (K) above, whichsatisfy the following properties:(A1) For any given u ∈ K , we suppose that a ( u ; · , · ) is a bilinear form on Y × X ,and b ( u, · ) ∈ X (cid:63) ; in the sequel, we use the notation b ( u, · ) = (cid:104) b ( u ) , ·(cid:105) , wherethe dual product is evaluated on the space X (cid:63) × X .(A2) There exist positive constants α, β > u ∈ K , the form a ( u ; · , · ) is uniformly bounded on Y × X and coercive on X × X in the sensethat a ( u ; v, w ) ≤ β (cid:107) v (cid:107) Y (cid:107) w (cid:107) Y ∀ v ∈ Y, ∀ w ∈ X, (8)and a ( u ; v, v ) ≥ α (cid:107) v (cid:107) Y ∀ v ∈ X, respectively; in particular, if the set K satisﬁes property (K), then it followsthat a ( u ; v − w, v − w ) ≥ α (cid:107) v − w (cid:107) Y ∀ v, w ∈ K. (9)(A3) There are Gˆateaux continuously diﬀerentiable functionals G : K → R and B : K → R such that, for all u ∈ K , it holds G (cid:48) ( u ) | X = a ( u ; u, · ) and B (cid:48) ( u ) | X = b ( u ) in X (cid:63) .(A4) The (continuously diﬀerentiable) functional H : K → R given by H ( u ) := G ( u ) − B ( u ), u ∈ K , with G and B from (A3), satisﬁes the strong monotonicitycondition (3).If the closed and convex subset K ⊂ Y fulﬁls property (K), then the uniqueminimizer u (cid:63) ∈ K of the functional H from (A4) solves the weak formulation0 = (cid:104) H (cid:48) ( u (cid:63) ) , v (cid:105) = (cid:104) G (cid:48) ( u (cid:63) ) − B (cid:48) ( u (cid:63) ) , v (cid:105) = a ( u (cid:63) ; u (cid:63) , v ) − (cid:104) b ( u (cid:63) ) , v (cid:105) ∀ v ∈ X, (10)cf. Corollary 2.3. Now, for given u ∈ K , deﬁne the linear operator A ( u ) : Y → X (cid:63) , v (cid:55)→ A ( u ) v , by (cid:104) A ( u ) v, w (cid:105) = a ( u ; v, w ) ∀ w ∈ X. Then, the weak formulation (10) can be expressed by A ( u (cid:63) ) u (cid:63) = b ( u (cid:63) ) in X (cid:63) . In light of (A2), for any u ∈ K , we notice that a ( u ; · , · ) is a bounded and coercivebilinear form on the closed subspace X . In particular, due to the Lax-Milgramtheorem, for any (cid:96) ∈ X (cid:63) , we conclude that there exists a unique w (cid:96) ∈ X such that ODIFIED KAˇCANOV METHOD 5 A ( u ) w (cid:96) = (cid:96) in X (cid:63) , i.e., A ( u ) | X : X → X (cid:63) is invertible for any u ∈ K . Hence,noticing that F ( u ) := H (cid:48) ( u ) = A ( u ) u − b ( u ) ∈ X (cid:63) , (11)the classical Kaˇcanov method in abstract form, for given u n ∈ K , is given by u n +1 = u n − ρ n , n ≥ , (12a)where ρ n ∈ X is uniquely deﬁned through A ( u n ) ρ n = F ( u n ) in X (cid:63) . (12b)A modiﬁcation of this procedure is obtained by invoking a parameter δ : K → (0 , ∞ ), thereby yielding the new scheme u n +1 = u n − δ ( u n ) ρ n , n ≥ , (13)with ρ n as in (12b). Equivalently, upon introducing the forms a δ ( u ; v, w ) := 1 δ ( u ) a ( u ; v, w ) , b δ ( u ) := 1 δ ( u ) a ( u ; u, · ) − F ( u ) , (14)for u ∈ K, v ∈ Y, and w ∈ X , we derive the modiﬁed Kaˇcanov iteration in weakform: u n +1 ∈ K : a δ ( u n ; u n +1 , v ) = (cid:104) b δ ( u n ) , v (cid:105) ∀ v ∈ X, n ≥ . (15)Clearly, for δ = 1, the traditional Kaˇcanov scheme (12) is recovered. Proposition 2.4.

Suppose that K satisﬁes property (K) . Then, for any initialguess u ∈ K , the modiﬁed Kaˇcanov iteration (15) remains well-deﬁned for each n ≥ , i.e., for given u n ∈ K , the solution u n +1 ∈ K of the weak formulation (15) exists and is unique.Proof. For ﬁxed u n ∈ K , the solution ρ n ∈ X of (12b) exists and is unique.Moreover, owing to property (K), we infer that u n +1 ∈ K in (13). (cid:3) Convergence analysis.

We are now in the position to state and prove themain result of our work.

Theorem 2.5.

Given the assumptions of Proposition 2.4. Moreover, suppose that H (cid:48) is Lipschitz continuous in the sense that there exists a constant L H > suchthat (cid:104) H (cid:48) ( u ) − H (cid:48) ( v ) , w (cid:105) ≤ L H (cid:107) u − v (cid:107) Y (cid:107) w (cid:107) Y ∀ u, v ∈ K, w ∈ X. (16) If δ : K → [ δ min , δ max ] , with < δ min ≤ δ max < α / L H , then the damped Kaˇcanoviteration (13) converges to the unique solution u (cid:63) ∈ K of (7) for any initial guess u ∈ K . Remark 2.6.

The key observation in our work is that the introduction of a damp-ing factor in the modiﬁed scheme (13) allows to weaken the assumptions in the clas-sical convergence theorem for the Kaˇcanov method, see, e.g., [Zei90, Thm. 25.L].Speciﬁcally, the inequality G ( u ) − G ( v ) ≥

12 ( a ( u ; u, u ) − a ( u ; v, v )) ∀ u, v ∈ K, (17)cf. [Zei90, Eq. (106)], which plays a crucial role in the standard analysis, can bedropped; instead, the proof of Theorem 2.5 hinges essentially on the bound (20)below. Furthermore, in contrast to the traditional framework, the operator B fromassumption (A3) need not to be linear in our analysis. We remark that theseimprovements come at the expense of condition (K) as well as of the Lipschitzcontinuity condition (16); with regards to the application to quasilinear ellipticPDE (1), however, these assumptions do not implicate any drawback. MODIFIED KAˇCANOV METHOD

Our proof of Theorem 2.5 follows to some extent the lines of the proof of [Zei90,Thm. 25.L], however, with some major modiﬁcations due to the absence of the keyinequality (17) .

Proof of Theorem 2.5.

We proceed in several steps.1. We show that (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y vanishes as n → ∞ . To this end, as in the proofof Proposition 2.1, we deﬁne the scalar function ϕ ( t ) := H ( u n + t ( u n +1 − u n )),for t ∈ [0 ,

1] and n ≥

0. Then, we ﬁnd that H ( u n +1 ) − H ( u n ) = (cid:90) (cid:10) H (cid:48) ( u n + t ( u n +1 − u n )) , u n +1 − u n (cid:11) d t (18)= (cid:90) (cid:10) H (cid:48) ( u n + t ( u n +1 − u n )) − H (cid:48) ( u n ) , u n +1 − u n (cid:11) d t + (cid:10) H (cid:48) ( u n ) , u n +1 − u n (cid:11) . Combining (11) and (14), we note that H (cid:48) ( u ) = F ( u ) = a δ ( u ; u, · ) − b δ ( u ) ∀ u ∈ K (19)in X (cid:63) . Hence, by invoking the Lipschitz continuity (16), and upon exploitingthe modiﬁed Kaˇcanov scheme (15), we obtain H ( u n +1 ) − H ( u n ) ≤ L H (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y + a δ ( u n ; u n , u n +1 − u n ) − (cid:10) b δ ( u n ) , u n +1 − u n (cid:11) = L H (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y − a δ ( u n ; u n +1 − u n , u n +1 − u n ) . Furthermore, employing the coercivity assumption (9), it follows that H ( u n +1 ) − H ( u n ) ≤ (cid:18) L H − αδ ( u n ) (cid:19) (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y . Thus, since δ ( u n ) ≤ δ max < α / L H , we deduce the bound H ( u n ) − H ( u n +1 ) ≥ (cid:18) αδ max − L H (cid:19) (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y > , (20)for u n +1 (cid:54) = u n . In particular, we infer that { H ( u n ) } n ≥ is a decreasing sequencewhich is bounded from below by H ( u (cid:63) ), and thus converges. Therefore,0 ≤ (cid:18) αδ max − L H (cid:19) (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y ≤ (cid:0) H ( u n ) − H ( u n +1 ) (cid:1) → , as n → ∞ , i.e., lim n →∞ (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y = 0.2. We show that { u n } n ≥ converges to u (cid:63) . For this purpose, we will follow closelythe proof of [HW20a, Prop. 2.1]. By the strong monotonicity (3), for any m ≥ n ≥

0, it holds that ν (cid:107) u m − u n (cid:107) Y ≤ (cid:104) H (cid:48) ( u m ) − H (cid:48) ( u n ) , (cid:15) m,n (cid:105) , with (cid:15) m,n := u m − u n ∈ X . Invoking (19) and using (15), this leads to ν (cid:107) (cid:15) m,n (cid:107) Y ≤ a δ ( u m ; u m , (cid:15) m,n ) − (cid:104) b δ ( u m ) , (cid:15) m,n (cid:105)− a δ ( u n ; u n , (cid:15) m,n ) + (cid:104) b δ ( u n ) , (cid:15) m,n (cid:105)≤ a δ ( u m ; u m − u m +1 , (cid:15) m,n ) − a δ ( u n ; u n − u n +1 , (cid:15) m,n ) . Applying (8) yields ν (cid:107) (cid:15) m,n (cid:107) Y ≤ βδ min (cid:107) (cid:15) m,n (cid:107) (cid:0)(cid:13)(cid:13) u m − u m +1 (cid:13)(cid:13) Y + (cid:13)(cid:13) u n − u n +1 (cid:13)(cid:13) Y (cid:1) , ODIFIED KAˇCANOV METHOD 7 and thus (cid:107) (cid:15) m,n (cid:107) Y ≤ βνδ min (cid:0)(cid:13)(cid:13) u m − u m +1 (cid:13)(cid:13) Y + (cid:13)(cid:13) u n − u n +1 (cid:13)(cid:13) Y (cid:1) . From the ﬁrst step of the proof, we conclude that { u n } n ≥ is a Cauchy sequencein K . Since K is a closed subset of a Banach space, the sequence { u n } n ≥ hassome limit u ∈ K .3. We show that u is a solution of (7), which is unique thanks to Proposition 2.1and Corollary 2.3. Indeed, from (19) and (15), for all w ∈ X , it follows that a δ ( u n ; u n +1 − u n , w ) + (cid:104) H (cid:48) ( u n ) , w (cid:105) = a δ ( u n ; u n +1 , w ) − (cid:104) b δ ( u n ) , w (cid:105) = 0 . Recalling that { u n +1 − u n } n ≥ is a vanishing sequence in X , and exploiting (8),we have a δ ( u n ; u n +1 − u n , w ) → n → ∞ , for any w ∈ X . Moreover, by theLipschitz continuity (16), we obtain (cid:104) H (cid:48) ( u ) , w (cid:105) = lim n → (cid:104) H (cid:48) ( u n ) , w (cid:105) = 0 ∀ w ∈ X, (21)i.e., u = u (cid:63) is the unique solution of (7).This completes the argument. (cid:3) Remark 2.7.

We can weaken the Lipschitz continuity assumption (16) in Theo-rem 2.5 by the condition (cid:104) H (cid:48) ( u ) − H (cid:48) ( v ) , u − v (cid:105) ≤ L H (cid:107) u − v (cid:107) Y ∀ u, v ∈ K, (22)provided that H (cid:48) is continuous with respect to the weak topology on X (cid:63) . Indeed,the ﬁrst step of the proof of Theorem 2.5 requires only the weaker assumption (22),and (21) will be implied by the continuity of H (cid:48) with respect to the weak topologyon X (cid:63) . Remark 2.8.

Applying the abstract analysis in [HPW20], it can be shown thatthe iterates generated by the modiﬁed Kaˇcanov scheme (15) satisfy the contractionproperty H ( u n +1 ) − H ( u (cid:63) ) ≤ q ( H ( u n ) − H ( u (cid:63) )) ∀ n ≥ , for some constant 0 < q <

1, where u (cid:63) is the solution of (7). The contractionfactor q , as derived in [HPW20], depends on the damping factor δ , and is, a priori,minimal for δ ≡ α / L H . We note, however, as the derivation of q contains some roughestimates, that this choice is typically too pessimistic for practical purposes.3. Adaptive step size control

In this section, we will present two adaptive methods for selecting the dampingparameter δ in the modiﬁed Kaˇcanov iteration (15). To this end, recall the keyinequality H ( u n ) − H ( u n +1 ) ≥ γ (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y , n ≥ , (23)cf. (20), which is crucial for the convergence of the modiﬁed Kaˇcanov scheme (15).If we set δ max := α / L H in (20), which is a possibly pessimistic choice, as mentionedin Remark 2.8, then (23) holds for γ = L H / . Alternatively, within the setting ofthe classical Kaˇcanov scheme, i.e., δ ≡

1, the bound (23) can be shown for theconstant γ = α / under more restrictive assumptions on the nonlinearity; see theproof of [Zei90, Thm. 25.L]. This observation may suggest that a smaller choice of γ potentially relates to a larger size of the damping parameter. We thus proposethat the sequence { u n } n ≥ is required to satisfy an estimate of the form H ( u n ) − H ( u n +1 ) ≥ θ min { α, L H } (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y , (24) MODIFIED KAˇCANOV METHOD for a constant 0 < θ ≤ / . In our numerical experiments in § θ = 0 . δ min := α / L H , which, in viewof Remark 2.8, is a reasonable choice.3.1. Step size control via Taylor expansion.

Suppose that F := H (cid:48) : K → Y (cid:63) is Fr´echet diﬀerentiable, and deﬁne the (continuously diﬀerentiable) function ψ n :[0 , → R by ψ n ( t ) := (cid:10) F ( u n + t ( u n +1 − u n )) , u n +1 − u n (cid:11) , t ∈ [0 , . Then, if the diﬀerence u n +1 − u n is suﬃciently small, applying a Taylor expansionat t = 0, yields ψ n ( t ) ≈ ψ n (0) + ψ (cid:48) n (0) t = (cid:10) F ( u n ) , u n +1 − u n (cid:11) + t (cid:10) F (cid:48) ( u n )( u n +1 − u n ) , u n +1 − u n (cid:11) . Since (13) implies that u n +1 − u n = − δ ( u n ) ρ n , n ≥ , we obtain ψ n ( t ) ≈ − δ ( u n ) (cid:104) F ( u n ) , ρ n (cid:105) + tδ ( u n ) (cid:104) F (cid:48) ( u n ) ρ n , ρ n (cid:105) . Moreover, using (11) and incorporating (18), we note that (cid:90) ψ n ( t ) d t = (cid:90) (cid:10) H (cid:48) ( u n + t ( u n +1 − u n )) , u n +1 − u n (cid:11) d t = H ( u n +1 ) − H ( u n ) . Hence, H ( u n ) − H ( u n +1 ) ≈ δ ( u n ) (cid:104) F ( u n ) , ρ n (cid:105) − δ ( u n ) (cid:104) F (cid:48) ( u n ) ρ n , ρ n (cid:105) , cp. (20). Then, a simple calculation reveals that the right-hand side is maximizedfor the damping parameter δ ( u n ) := (cid:104) F ( u n ) , ρ n (cid:105)(cid:104) F (cid:48) ( u n ) ρ n , ρ n (cid:105) . (25)In account of (24), this leads to the step size Algorithm 1. We note that thestopping criterion in line 6 will be satisﬁed once the damping parameter is smallenough, i.e., the procedure terminates after ﬁnitely many steps. Algorithm 1

Step size control via Taylor expansion

Input:

Given u n ∈ K , a correction factor σ ∈ (0 , θ ∈ (0 , / ]. Solve the linear problem A ( u n ) ρ n = F ( u n ) for ρ n ∈ X , cf. (12b). Deﬁne δ ( u n ) from (25). repeat Compute u n +1 := u n − δ ( u n ) ρ n , cf. (13). Set δ ( u n ) ← σδ ( u n ). until H ( u n ) − H ( u n +1 ) ≥ θ min { α, L H } (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y return u n +1 . ODIFIED KAˇCANOV METHOD 9

Step size control via a prediction-correction strategy.

We will presenta second adaptive damping parameter selection procedure that is partially based onideas from [Deu04, § F = H (cid:48) . The idea is fairly straightforward: For a given correction factor σ ∈ (0 ,

1) anddamping factor δ >

0, we compare the energy decay for the damping parameters δ and δ (cid:48) = σ p δ , where p ∈ {− , } depends on the previous step; we note that p = − p = 1 decreases the damping parameter. Ifapplying δ (cid:48) results in a larger energy decay then we choose the damping parameterto be δ (cid:48) in the present and subsequent steps, with p unchanged; otherwise, if δ outperforms δ (cid:48) , then δ is retained, however, in the next step we replace p by − p .This leads to Algorithm 2. Algorithm 2

Step size control via prediction-correction strategy

Input:

Given u n ∈ K , a damping parameter δ , an exponent p ∈ {− , } , a correc-tion factor σ ∈ (0 , θ ∈ (0 , / ]. Solve the linear problem A ( u n ) ρ n = F ( u n ) for ρ n ∈ X , cf. (12b). Set δ (cid:48) := σ p δ and compute (cid:101) u n +1 := u n − δ (cid:48) ρ n , cf. (13). if H ( u n ) − H ( (cid:101) u n +1 ) ≥ θ min { α, L H } (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y then Compute u n +1 := u n − δρ n , cf. (13). if H ( (cid:101) u n +1 ) ≤ H ( u n +1 ) then Set δ ← δ (cid:48) and u n +1 ← (cid:101) u n +1 . else Set p ← − p . end if else Set p ← δ ← δ (cid:48) , and u n +1 := (cid:101) u n +1 . while H ( u n ) − H ( u n +1 ) < θ min { α, L H } (cid:13)(cid:13) u n +1 − u n (cid:13)(cid:13) Y do Set δ ← σδ and compute u n +1 := u n − δρ n , cf. (13). end while end if return δ , u n +1 , and p .4. Application to quasilinear diffusion models

In this section, we discuss the weak formulations of the boundary value prob-lem (1) as well as of the Kaˇcanov iteration scheme (2). In addition, an equivalentvariational setting will be established. Furthermore, a series of numerical experi-ments in the framework of ﬁnite element discretizations will be presented.4.1.

Sobolev spaces.

Let Y := H (Ω) be the standard Sobolev space of L (Ω)-functions with weak derivatives in L (Ω). We endow Y with the inner product( u, v ) Y := (cid:90) Ω ∇ u · ∇ v d x + (cid:90) Ω uv d x , u, v ∈ Y, and with the induced H -norm (cid:107) u (cid:107) Y := (cid:112) ( u, u ) Y , u ∈ Y . Moreover, consider theclosed linear subspace X := { w ∈ Y : w | Γ = 0 } ⊂ Y , where w | Γ denotes thetrace of w ∈ Y on the (non-vanishing) Dirichlet boundary part Γ ⊂ ∂ Ω. We equip X with the H -seminorm (cid:107) u (cid:107) X := (cid:107)∇ u (cid:107) L (Ω) , for u ∈ X ; owing to the Poincar´e-Friedrichs inequality, we note that the norm (cid:107) · (cid:107) X is equivalent to the norm (cid:107)·(cid:107) Y on X , i.e., there exists a constant c > c (cid:107) u (cid:107) Y ≤ (cid:107) u (cid:107) X ≤ (cid:107) u (cid:107) Y for all u ∈ X . Finally, we consider the closed and convex subset K := { w ∈ Y : w | Γ = g on Γ } ⊂ Y, (26)with g the Dirichlet boundary data from (2b). Evidently, K has property (K). Inparticular, if Γ = ∂ Ω and g ≡ ∂ Ω, then we may consider Y = X = K = H (Ω)with the norm (cid:107) · (cid:107) X .4.2. Weak formulations.

For any given u ∈ K , we deﬁne a (symmetric) bilinearform a ( u ; · , · ) on Y × Y by a ( u ; v, w ) := (cid:90) Ω µ (cid:0) |∇ u | (cid:1) ∇ v · ∇ w d x , v ∈ Y, w ∈ X. (27)Moreover, we introduce the linear functional (cid:104) b, v (cid:105) := (cid:90) Ω f v d x − (cid:90) Γ hv d x , v ∈ X, (28)where (cid:104)· , ·(cid:105) denotes the duality pairing in X (cid:63) × X , with X (cid:63) signifying the dualspace of X . If the source function f ∈ L (Ω) and the Neumann boundary data h ∈ L ( ∂ Ω), then we notice that b ∈ X (cid:63) ; incidentally, more general assumptions onthe data can be made, see, e.g., [Zei90, Rem. 25.32].In terms of the above forms, the weak formulation of (1) reads as follows:Find u ∈ K : a ( u ; u, v ) = (cid:104) b, v (cid:105) ∀ v ∈ X. (29)Furthermore, for given u n ∈ K , n ≥

0, the weak form of the Kaˇcanov scheme (2)is to ﬁnd u n +1 ∈ K such that a ( u n ; u n +1 , v ) = (cid:104) b, v (cid:105) ∀ v ∈ X. The ensuing result follows from standard arguments.

Proposition 4.1.

If the diﬀusion coeﬃcient µ satisﬁes ( µ , then the form a ( · ; · , · ) from (27) is bounded in the sense that | a ( u ; v, w ) | ≤ M µ (cid:107) v (cid:107) Y (cid:107) w (cid:107) Y ∀ u ∈ K, ∀ v ∈ Y, ∀ w ∈ X. Moreover, there exists a constant α > such that, for any u ∈ Y , we have thecoercivity property a ( u ; v, v ) ≥ α (cid:107) v (cid:107) Y ∀ v ∈ X, and, especially, a ( u ; v − w, v − w ) ≥ α (cid:107) v − w (cid:107) Y ∀ v, w ∈ K. Variational framework.

We introduce a (nonlinear) functional G : K → R by G ( u ) := (cid:90) Ω ψ (cid:0) |∇ u | (cid:1) d x , with ψ ( s ) := 12 (cid:90) s µ ( t ) d t, s ≥ . (30)For u ∈ K , the Gˆateaux derivative of G is given by (cid:104) G (cid:48) ( u ) , v (cid:105) = (cid:90) Ω ψ (cid:48) (cid:0) |∇ u | (cid:1) ∇ u · ∇ v d x = (cid:90) Ω µ (cid:0) |∇ u | (cid:1) ∇ u · ∇ v d x , (31)for all v ∈ X , i.e., G (cid:48) ( u ) = a ( u ; u, · ) in X (cid:63) . Then, introduce the (energy) potential H : K → R by H ( u ) := G ( u ) − (cid:104) b, u (cid:105) , (32)with G and b from (30) and (28), respectively. If the diﬀusion coeﬃcient µ satisﬁesthe estimates m µ ( t − s ) ≤ µ ( t ) t − µ ( s ) s ≤ M µ ( t − s ) , t ≥ s ≥ , (33) ODIFIED KAˇCANOV METHOD 11 then the Lipschitz condition (16) and the strong monotonicity property (3) can bededuced with ν = m µ and L H = 3 M µ , cf. [Zei90, Prop. 25.26]. Remark 4.2.

It is easily shown that ( µ µ

4) implies (33), however, we emphasizethat the latter assumption does not require µ to be decreasing or diﬀerentiable.The following result is a direct consequence of Corollary 2.3. Proposition 4.3.

Suppose that the diﬀusion coeﬃcient µ satisﬁes (33) . Then, thefunctional H from (32) has a unique minimizer u (cid:63) ∈ K , cf. (26) , which satisﬁesthe weak formulation (cid:90) Ω µ (cid:0) |∇ u | (cid:1) ∇ u · ∇ v d x = (cid:90) Ω f v d x − (cid:90) Γ hv d x ∀ v ∈ X, (34) i.e., u (cid:63) is the unique (weak) solution of (29) . Convergence of the modiﬁed Kaˇcanov method.

Recall that the prop-erties (A1)–(A4) as well as (K) are satisﬁed if the diﬀusion coeﬃcient obeys thebounds (33) by our analysis in the previous sections § § without assuming ( µ Theorem 4.4.

Assume that the diﬀusion coeﬃcient satisﬁes the bounds (33) .Then, the damped Kaˇcanov method (15) , with δ : K → [ δ min , δ max ] and < δ min ≤ δ max < α / M µ , converges to the unique weak solution u (cid:63) ∈ K of (34) . Remark 4.5.

We emphasize that the assumptions on the damping function δ fromTheorem 4.4 are suﬃcient for the key inequality (20) to hold, however, they arenot necessary. Indeed, as both step size methods from § δ .4.5. Numerical experiments.

We will now perform a number of numerical testsfor the modiﬁed Kaˇcanov method based on the diﬀerent step size methods from § − , \ ([0 , × [ − , R . We focus on homogeneous Dirichlet boundary conditions,i.e., Γ = ∂ Ω and g ≡

0, and therefore K = Y = X = H (Ω). The sourcefunction f in (1a) is chosen such that the analytical solution of (1a)–(1b), with g = 0, is given by the smooth function u (cid:63) ( x, y ) = sin( πx ) sin( πy ). For the numericalapproximation, we will use a conforming P -ﬁnite element framework with a uniformmesh consisting of approximately 3 · triangles. Throughout we set the correctionfactor in the adaptive step size algorithms to be σ = 0 . Monotonically decreasing diﬀusion.

We consider the nonlinear diﬀusion co-eﬃcient µ ( t ) = ( t + 1) − + / , for t ≥

0, see Figure 1. It is straightforward to verifythat the diﬀusion parameter µ satisﬁes (33) as well as the properties ( µ µ §

3. Even though the dif-fusion parameter is monotonically decreasing and diﬀerentiable, which implies theconvergence of the classical Kaˇcanov scheme, we can see from Figure 2 that thedamped Kaˇcanov method with either the step size method from § § . . t µ µ Figure 1.

Diﬀusion coeﬃcients.

Figure 2.

Experiment 4.5.1. Left: Comparison of the perfor-mance of the classical Kaˇcanov scheme (”Undamped”) with thestep size algorithms from § § Non-monotone diﬀusion.

In our second experiment, we consider the nonlin-ear diﬀusion parameter µ ( t ) = t exp( − t ) log( t + (cid:15) ) + 1, t ≥

0, for (cid:15) = 10 − , seeFigure 1. It can be shown that µ satisﬁes (33), however, is not monotonically (de-creasing). Even though Figure 3 indicates that the classical Kaˇcanov scheme (12)may still converge, we see that the convergence rate is rather poor. In contrast,the modiﬁed Kaˇcanov scheme (15) with either damping strategy from § Conclusion

In this work, we have devised a modiﬁed version of the classical Kaˇcanov itera-tion scheme. We have shown that the introduction of a damping parameter allowsto derive a new convergence analysis which applies to a wider class of problems.In particular, in the context of quasilinear elliptic PDE, a standard monotonicitycondition on the diﬀusion coeﬃcient can be dropped. Moreover, our numerical testshighlight that the modiﬁed Kaˇcanov method, in combination with suitable step sizestrategies, outperforms the classical scheme for the examples under consideration.We close by remarking that our work can be extended in a straightforward man-ner to quasilinear systems with applications to, e.g., plasticity or quasi-Newtonianﬂuids.

References [AIM09] K. Astala, T. Iwaniec, and G. Martin,

Elliptic partial diﬀerential equations and qua-siconformal mappings in the plane , Princeton Mathematical Series, vol. 48, PrincetonUniversity Press, Princeton, NJ, 2009.

ODIFIED KAˇCANOV METHOD 13

Figure 3.

Experiment 4.5.2. Left: Comparison of the perfor-mance of the classical Kaˇcanov scheme (”Undamped”) with thestep size algorithms from § § [Deu04] P. Deuﬂhard, Newton methods for nonlinear problems , Springer Series in ComputationalMathematics, vol. 35, Springer-Verlag, Berlin, 2004, Aﬃne invariance and adaptivealgorithms.[GMZ11] E. M. Garau, P. Morin, and C. Zuppa,

Convergence of an adaptive Kaˇcanov FEM forquasi-linear problems , Appl. Numer. Math. (2011), no. 4, 512–529.[HJS97] W. Han, S. Jensen, and I. Shimansky, The Kaˇcanov method for some nonlinear prob-lems , Appl. Numer. Meth. (1997), 57–79.[HPW20] P. Heid, D. Praetorius, and T. P. Wihler, A note on energy contraction and opti-mal convergence of adaptive iterative linearized ﬁnite element methods , Tech. Report2007.10750, arxiv.org, 2020.[HW20a] P. Heid and T. P. Wihler,

Adaptive iterative linearization Galerkin methods for non-linear problems , Math. Comp. (2020), no. 326, 2707–2734.[HW20b] , On the convergence of adaptive iterative linearized Galerkin methods , Calcolo (2020), no. 3, Paper No. 24, 23.[Kaˇc59] L. M. Kaˇcanov, Variational methods of solution of plasticity problems , J. Appl. Math.Mech. (1959), 880–883. MR 0112408[Zei88] E. Zeidler, Nonlinear functional analysis and its applications. IV , Springer-Verlag, NewYork, 1988, Applications to mathematical physics, Translated from the German andwith a preface by Juergen Quandt.[Zei90] ,